---
title: "Why lasR?"
author: "Jean-Romain Roussel"
output:
  html_document:
    toc: true
    toc_float:
      collapsed: false
      smooth_scroll: false
    toc_depth: 2
vignette: >
  %\VignetteIndexEntry{1. Why lasR?}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup, echo = F}
suppressPackageStartupMessages(library(lasR))
```

## Rationnale for `lasR` vs. `lidR`

Do we need a new package? The short answer lies in the following graph. The x-axis represents the time to perform three different rasterizations (a CHM, a DTM, and a density map), and the y-axis represents the amount of RAM memory used for `lidR` and `lasR` (more details in the [benchmark](benchmarks.html) vignette). `lasR` is intended to be much more efficient than `lidR` both in terms of memory usage and computation times.

```{r, fig.width=6, fig.height=4, warning=F, echo=-1, echo=FALSE, fig.align = 'center'}
col1 = rgb(0, 0.5, 1, alpha = 0.5)
col2 = rgb(1, 0, 0, alpha = 0.5)

par(mar = c(4,4,1,1))

m_lasr = c(0, 82.03, 170.51, 354.01, 531.90, 711.60, 895.17, 1074.86, 1214.60, 1259.20, 1297.35, 1297.35, 435.67,550.40, 692.43, 831.39, 971.39, 1113.44, 1282.57, 1285.92, 1345.21, 
1345.21, 498.85, 559.43, 697.85, 837.33, 977.29, 1127.08, 1258.82, 1240.92, 1240.92, 1240.92, 480.92, 606.07, 749.79, 887.94, 1026.13, 1169.48, 1358.71, 1378.30, 1423.42, 1423.42, 
554.91, 560.07, 0)
t_lasr = 1:length(m_lasr)*2/60

m_lidr = c(0, 104.86, 456.99, 802.07, 1116.71, 1765.46, 2116.21, 2502.28, 2276.53, 2666.14, 2986.86, 2300.90, 2686.19, 2989.73, 2004.21, 2579.39, 3195.28, 2827.48, 3512.35, 4633.82, 
5221.53, 5639.37, 3357.23, 4434.14, 5110.58, 4302.41, 3457.53, 4052.46, 4702.15, 4430.27, 4585.12, 4937.42, 5243.30,4713.65, 5063.31, 5376.67, 2866.46, 3207.25, 3459.13, 3188.25, 
3523.44, 3733.16, 0)

t_lidr = 1:length(m_lidr)*8/60

x = t_lidr
y = m_lidr/1000

par(mar = c(4,4,1,1))

plot(x, y, type = "n", ylim = c(0, max(y)), ylab = "Memory (GB)", xlab = "Time (min)")

polygon(c(x, rev(x)), c(rep(0, length(x)), rev(y)), col = col1, border = NA)
lines(x, y, col = "blue", lwd = 2)

x = t_lasr
y = m_lasr/1000

polygon(c(x, rev(x)), c(rep(0, length(x)), rev(y)), col = col2 , border = NA)
lines(x, y, col = "red", lwd = 2)

legend("topleft", legend = c("lidR", "lasR"), fill = c( col = col1, col2), border = NA)
```

The second issue is the absence of a powerful pipeline engine in `lidR`. Performing a task as simple as extracting and deriving metrics for multiple inventory plots from a non-normalized collection of files is not that easy in `lidR`. It is straightforward if the point cloud is normalized, but if not, users must write a complex custom script. With the introduction of real pipelines, `lasR` enables users to do more complex tasks in an easier way (see [the tutorial](tutorial.html) vignette as well as [the pipeline](pipeline.html) vignette).

Last but not least, I have almost a decade of additional experience with R, C++, point cloud processing, and a lot of feedback compared to when I started the creation of `lidR`. I was simply not technically capable of writing `lasR` ten years ago!

## Main differences between `lasR` and `lidR`

### Pipeline

`lasR` introduces a versatile pipeline engine, enabling the creation of more complex processing pipelines. Users can simultaneously create an ABA and compute a DTM in one read pass, leading to a significant speed-up.

### Data loading

Unlike `lidR`, `lasR` does not load lidar data into a `data.frame`. It is designed for efficient data processing, with memory management at the C++ level. Consequently, there is no `read_las()` function. Everything is internally and efficiently stored in a C++ structure that keeps the data compact in memory. However, some entry points are available to inject user-defined R code in the C++ pipeline.

### Dependencies

`lasR` has only 0 dependency. It doesn't even depend on `Rcpp`. `lasR` does not use `terra` and `sf` at the R level for reading and writing spatial data; instead, it links to `GDAL`. If `terra` and `sf` are installed, the output files will be read with these packages. Due to the absence of dependency on R package and the non-loading of data as R objects, there is also no dependency on `rgl`, resulting in no interactive 3D viewer like in `lidR`.

### Code

`lasR` is written 100% in C++ and contains no R code. It utilizes the source code of `lidR` with significant improvements. The major improvements observed in the [benchmark](benchmarks.html) are not so much in the source code but rather in the organization of the code, i.e., no longer using `data.frame`, memory management in C++ rather than R, no processing at the R level, pipelines, and so on.

## Should I use `lidR` or `lasR`?

The question is actually pretty simple to answer. If you want to explore, manipulate, test, try, retry, and implement new ideas you have in mind, use `lidR`. If you know what you want, and what you want is relatively common (raster of metrics, DTM, CHM, tree location), especially if you want it on a large coverage, use `lasR`.

### Example 1

I received 500 km² of data, and I want a CHM and a DTM.

&rarr; Use `lasR` to compute both as fast as possible.

### Example 2

I want to segment the trees, explore different methods, and test different parameters on small plots. Maybe I will integrate a custom step, but it's an exploratory process.

&rarr; Use `lidR`.

### Example 3

I want to extract circular ground inventories and compute metrics for each plot. 

&rarr; If the dataset is already normalized, you can use either `lasR` or `lidR`; this is pretty much equivalent. `lidR` will be easier to use; `lasR` will be a little bit more efficient but more difficult to use (yet the [pipeline vignette](pipeline.html) contains a copy-pastable code for that). If your dataset is not normalized, `lasR` will be much simpler in that case, thanks to the pipeline processor that allows adding a normalization stage before computing the metrics.

### Example 4

I want to create a complex pipeline that computes the local shape of the points to classify roofs and wires in the point cloud. Then using a shapefile, I want to classify the water in the point cloud. To finish, I want to write new classified LAS files. 

&rarr; Use `lidR`. `lasR` does not have so many tools. `lasR` is not `lidR`; it is much more efficient but less versatile and has fewer tools.

### Example 5

I want to find and segment the trees with a common algorithm. Nothing fancy. I want to do that on 100 km² or more. 

&rarr; Use `lasR`. `lidR` will probably fail at doing it.