Why is Julia meant for FSPM?

Many of you must have heard of the Julia programming language by now, perhaps from me or @rvezy or someone else, but may be wondering what all the fuss is about. There is plenty of material online that explains the details better than I ever could, but I thought I would share why I think Julia is the perfect language for FSPM. My idea is that anyone who has used Julia can also share their experience below (or, if you have doubts, ask questions about it!).

Below is an elevator pitch (tl;dr), and then you can stay in the elevator until the 50th floor and read the rest :sweat_smile:.

tldr: It is very easy to write code in Julia and make it run as fast as your machine allows, even if you are not an experienced software developer. Also, Julia is designed for collaborative projects and for maximum code reuse with minimum hassle. These two aspects challenge the classic developer-user dualism that I believe is at the root of the sustainability crisis in scientific software, allowing for cutting-edge software that is easier and cheaper to maintain and more accessible to the average user :smiley:.

Julia: A solution to the two-language problem

Firstly, the Julia programming language is the first serious attempt at solving the two-language problem. This is a phenomenon whereby programming in many areas (e.g., research) is split into two roles: users who employ slow, dynamic, interactive languages (R, Python, Matlab) that are easy to learn and use for prototyping code, and developers who use fast, compiled, static languages (C/C++, Fortran, Java) to implement platforms, algorithms, models and anything that has to scale and run fast. Although there have been attempts to speed up classic dynamic languages, there is still a large performance gap between languages like Matlab, Python or R and the potential performance of a really fast language (Julia Micro-Benchmarks).

The consequence is that, for applications where significant engineering and outsourcing to compiled libraries is possible, the user will just interact with an API in Python or R without sacrificing too much (good examples would be deep learning with Keras or Bayesian modeling with brms). However, the moment the user needs that one extra feature that the developer did not account for, they are out of luck, as it may not even be possible to add it without significant engineering effort (good examples are anything involving non-linear or “black box” models: non-linear optimization, non-linear solvers, (partial) differential equations, ray tracing).

Julia is a fast, dynamic, interactive language designed specifically to get rid of the two-language problem. You can write your code in Julia and, with very little effort on your part, get performance close to that of native C code (at that point the limitation is your skill as a programmer). Not only that, Julia tends to produce high performance with very little code that is as readable as Python code, and the performance improvements follow a clean and well-documented workflow. And this applies to any code, no matter the domain or whether it is a linear or non-linear problem. This challenges the classic paradigm: coders can become more like developers with much less time investment, and that includes maintaining the code (which is what kills most software…). This blog post illustrates this nicely: My Target Audience. This shift is important: the developer-user model may work in industry but not in academia, because the cost of maintaining and improving software that requires experienced software developers keeps increasing, while the options users have remain limited.

This developer-user dualism is very strong in the FSPM community. In some cases, the FSPM software is not even presented as an API within a scientific language (like Keras or brms, which I mentioned above) but as a standalone “studio” or “platform” that is meant to be self-contained and offer everything the user requires. This can lead to workflows that are hard or impossible to reproduce (let alone automate) without significant effort from the user. Inevitably, this also means the user will miss features that are needed and cannot leverage other packages like one would do in R, Julia or Python. This indicates that FSPM has a strong developer-user culture and a two-language problem, hence the potential of Julia.

Julia: Decentralized and collaborative

If you use Julia you will hear about its most important feature as a language: multiple dispatch. This may sound like an obscure technical detail but it actually enables another paradigm shift in the Julia community. Rather than the old paradigm of large, monolithic packages with complex class hierarchies, Julia favors a decentralized, functional approach to programming. The classic paradigm emerged out of object-oriented programming with reference classes (as defined in C++, later Java and Python, which have been the languages used to teach computer scientists). In this old paradigm, the emphasis is on defining classes where the data structure and functionality (methods) are bound to each other and code reuse is mostly achieved by inheriting data and methods from other classes.

In Julia, the emphasis is on generic programming and interfaces: focus on functionality or on data structures, but do not marry one to the other. This leads to a horizontal rather than vertical organization of data types and methods, which enables a massive amount of code reuse. Do you need to add functionality to an existing data type? Just define a method for it without ever touching the source code. Do you want your function to apply to any data type? Just leave the type of the input undefined and Julia will compile a specialized method when you call the function (like any function in Python or R, but actually fast).
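To make this a bit more concrete, here is a minimal sketch of what "functions separate from data" can look like. The types `Leaf` and `Internode` and the functions `surface` and `total_surface` are made up for illustration and do not come from any existing package.

```julia
# Hypothetical organ types, made up for this sketch.
struct Leaf
    area::Float64        # m²
end

struct Internode
    length::Float64      # m
    radius::Float64      # m
end

# Functionality lives outside the types: one generic function, one method per type.
surface(l::Leaf) = l.area
surface(i::Internode) = 2π * i.radius * i.length   # lateral surface of a cylinder

# Extending a type we did not write (here Julia's built-in real numbers)
# just means adding another method; nobody's source code is modified.
surface(x::Real) = float(x)

# No type annotations: works for any collection whose elements have a `surface` method.
total_surface(organs) = sum(surface, organs)

organs = [Leaf(0.5), Internode(1.0, 0.5), 2]
total_surface(organs)   # ≈ 2.5 + π
```

Note that `total_surface` never mentions `Leaf` or `Internode`. Someone else could define a new organ type in their own package, add a `surface` method for it, and reuse `total_surface` unchanged.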

An example of this amazing feature is how you can take any function that was written to work with “normal” arrays of numbers and, if it is written in a certain style (which is often required for performance anyway), run the code on the CPU or the GPU just by changing the type of array (e.g., just replacing Array with CuArray in your code). More examples include performing calculations with physical units (we have all been bitten by that one…), uncertainty propagation, automatic differentiation, etc. In all these cases, if the function is written as generically as possible, a simple change in the type of the inputs will greatly extend what you can do without changing the source code. Try doing any of this with a library that contains 100 classes all tangled up in an inheritance tree.
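A small sketch of the same idea using only base Julia. The function `response` is a made-up saturating light response curve, and `Dual` is a toy dual number written here for illustration. It is only a stand-in for what dedicated automatic differentiation packages do far more completely.

```julia
# Generic model: no type annotations, so it accepts any number-like input.
# Hypothetical saturating response of photosynthesis to irradiance I.
response(I, a, Pmax) = Pmax * a * I / (Pmax + a * I)

# Toy dual number (value + derivative) to sketch forward-mode differentiation.
struct Dual <: Number
    val::Float64
    der::Float64
end
Base.:+(x::Dual, y::Dual) = Dual(x.val + y.val, x.der + y.der)
Base.:*(x::Dual, y::Dual) = Dual(x.val * y.val, x.der * y.val + x.val * y.der)
Base.:/(x::Dual, y::Dual) = Dual(x.val / y.val, (x.der * y.val - x.val * y.der) / y.val^2)

# Plain real numbers mixed with duals are treated as constants (zero derivative).
Base.convert(::Type{Dual}, x::Real) = Dual(x, 0.0)
Base.promote_rule(::Type{Dual}, ::Type{<:Real}) = Dual

# The same source code, three kinds of input:
response(1000.0, 0.05, 30.0)                   # ordinary floating point, ≈ 18.75
response(1000 // 1, 1 // 20, 30 // 1)          # exact rational arithmetic, 75//4
d = response(Dual(1000.0, 1.0), 0.05, 30.0)    # value and derivative in one pass
d.val, d.der                                   # d.der is the slope with respect to I
```

The body of `response` was never edited. Only the type of the inputs changed, and Julia compiled a specialized version for each case.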

Finally, collaboration in Julia is enabled not only by the way code can be combined but also by the way developers interact. Every Julia package is a git repository; it has to be, there is no other option. Registering a package is pretty much a matter of adding a GitHub repository address to a central registry (correct me if I am wrong here), which itself is just a GitHub repository (try to register an R package in CRAN and it will make you cry…). The entire package development process is meant for a collaborative workflow (e.g., documentation and unit testing are designed to work with continuous integration from the start) and anyone is welcome, regardless of status or institution, as long as they bring good ideas to the table. Also, most Julia packages have a vanilla MIT license, which basically means: “do whatever you want with the code and don’t bother me about it”.

If you got here, congratulations, you are either enthusiastic about Julia or you have a lot of free time. Anyhow thanks for reading and hope to see your reply below :smiley:


This is incredibly well put!

Just a note on the package registration process: I certainly agree that it’s way simpler than registering an R package on CRAN :slight_smile:. I can confirm that it’s trivial as long as you follow the basic rules for packages (see: 5. Creating Packages · Pkg.jl). And I have to say that it is almost impossible to get it wrong when using PkgTemplates.jl. Then registering the package is just a matter of adding a comment below your commit: @JuliaRegistrator register. You can see an example here on PlantBiophysics.jl’s repo. And if that’s too cumbersome, there’s even a GitHub Action for that! So then it’s just a click of a button.


Thanks Alejandro, this is very convincing! I really like the idea of having a language that can solve the two-language problem, because this is indeed one of the biggest challenges. There are many “developers” that work on FSPM but there are also many (and maybe even more) “users”. And for users the program needs to be clear and simple, and there should be a good “help system”.
Personally I like integrated programs like GroIMP and RStudio because they have a set of built-in functions that makes them easy to work with, and there is already a layout that provides an overview.

However, GroIMP is impossible to use if you don’t know how to do something, because the help functions/options are difficult to use or absent. R(Studio), on the other hand, has such a big community that you can google any question and someone has already written the answer for you. I think this is crucial for “users”.

So my point is: can you create a platform using Julia that has a good “help section” where you can ask a “how to…” question and find the answer? This would make it easier for “users” to use, and this way they can also learn.


I am not sure what you mean by the “how to…” help. I document all the functions in the packages in as much detail as possible, but I am also writing a lot of tutorials on how to do different things with VPL; maybe that is what you meant? As for asking questions, this forum right here is the right place: just look for the relevant sub-category in the Software section :slight_smile:


I agree with Alejandro, this is precisely why this forum could be helpful: we can finally centralise the questions about a particular piece of software, benefit from the community’s knowledge, and most importantly have the discussions searchable via Google or the search bar in Discourse, which is very good at finding what you are searching for! We must now broadly encourage our users to ask questions on the Discourse rather than by mail or anything else. This way the community can benefit from the accumulated knowledge.


Just came across this forum and thread. I’ve been working on a crop modeling framework called Cropbox, written in Julia, for years now and definitely agree with your opinions here. Julia has great potential, although there are still many areas that need improvement.

FSPM was not a primary goal for my project, but we still had some ideas for incorporating concepts from FSPM and managed to build a proof of concept 3D root model there. More plans ahead and I’m looking forward to sharing them with you and also learning more from you all.


Hi Tom!
This is fantastic news!
I suggest you get in contact with @rvezy and @AlejandroMorales because they have a similar story. They were also working independently on FSPM Julia packages, and now they decided to synergize to avoid reinventing the wheel in parallel. That’s also one of the reasons behind the creation of the FSPM forum.

It would be really nice if you could combine your work!