x-language coding
A simple demonstration of coding with Python and R within the same IDE using Quarto.
Quarto
Quarto weaves together narrative text and code to produce elegantly formatted output. Most interestingly, Quarto can execute code written in different languages within a single document. The following illustrates cross-language coding in Quarto (here using R and Python; other languages, e.g. Julia, are supported as well).
Cross-language coding
R
R code chunks are written and executed just as in R Markdown, so this should feel very familiar to R Markdown users.
if (!require("pacman")) install.packages("pacman")
pacman::p_load(tidyverse, palmerpenguins, reticulate, viridis)
data(penguins, package = "palmerpenguins")
penguins <- penguins[complete.cases(penguins), ]
ggplot(penguins,
aes(x = flipper_length_mm, y = bill_length_mm)) +
geom_point(aes(color = species, shape = species)) +
scale_color_viridis(discrete = TRUE) +
labs(
title = "Flipper and bill length",
subtitle = "Dimensions for penguins at Palmer Station LTER",
x = "Flipper length (mm)", y = "Bill length (mm)",
color = "Penguin species", shape = "Penguin species"
) +
theme_light()
Python
The beauty comes into play when we add Python chunks of code.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
penguins = r.penguins # reticulate's `r` object exposes R variables; the R tibble arrives as a pandas DataFrame
penguins.head(3)
species | island | bill_length_mm | ... | body_mass_g | sex | year |
---|---|---|---|---|---|---|
Adelie | Torgersen | 39.1 | ... | 3750 | male | 2007 |
Adelie | Torgersen | 39.5 | ... | 3800 | female | 2007 |
Adelie | Torgersen | 40.3 | ... | 3250 | female | 2007 |
We simply switched from R to Python while working on the same penguins data set.
Now, let us train a simple classifier using Python's scikit-learn module.
y = penguins['species']
X = penguins[['bill_length_mm', 'bill_depth_mm', 'flipper_length_mm', 'body_mass_g']]  # double brackets select a list of columns
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=333)
classifier = RandomForestClassifier().fit(X_train, y_train)
y_test_pred = classifier.predict(X_test)
print(classification_report(y_test, y_test_pred))
Class | Precision | Recall | F1-Score | Support |
---|---|---|---|---|
Adelie | 1.00 | 0.95 | 0.98 | 22 |
Chinstrap | 1.00 | 1.00 | 1.00 | 15 |
Gentoo | 0.97 | 1.00 | 0.98 | 30 |
Average/Total | 0.99 | 0.99 | 0.99 | 67 |
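To make the columns of the report above less of a black box, here is a minimal sketch of how per-class precision, recall, F1 and support can be computed by hand. It is not scikit-learn's implementation; the function name per_class_report and the toy labels are made up for illustration. The toy data contains one Adelie predicted as Gentoo, a pattern that lowers Adelie's recall and Gentoo's precision, which is consistent with the kind of figures shown in the report above.

```python
from collections import Counter


def per_class_report(y_true, y_pred):
    """Return {label: (precision, recall, f1, support)} for every label seen.

    Hypothetical helper for illustration only; not part of scikit-learn.
    """
    true_count = Counter(y_true)   # support: how often each class really occurs
    pred_count = Counter(y_pred)   # how often each class was predicted
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # true positives

    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        # precision: share of predictions for this class that were correct
        precision = hits[label] / pred_count[label] if pred_count[label] else 0.0
        # recall: share of actual members of this class that were found
        recall = hits[label] / true_count[label] if true_count[label] else 0.0
        # F1: harmonic mean of precision and recall
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        report[label] = (precision, recall, f1, true_count[label])
    return report


# Toy labels (invented, not the penguins test set): one Adelie is mistaken for a Gentoo.
y_true = ["Adelie", "Adelie", "Adelie", "Gentoo", "Gentoo", "Chinstrap"]
y_pred = ["Adelie", "Adelie", "Gentoo", "Gentoo", "Gentoo", "Chinstrap"]

for label, (precision, recall, f1, support) in per_class_report(y_true, y_pred).items():
    print(f"{label:<10} {precision:.2f} {recall:.2f} {f1:.2f} {support}")
```

In the toy data, Adelie keeps perfect precision but loses recall, while Gentoo keeps perfect recall but loses precision: a single confused sample affects two rows of the report.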
Conclusion
With Quarto it is possible to write reports, presentations, and more using several languages in one document. Here, we used R's tidyverse for data preprocessing and visualization and Python's machine learning frameworks (scikit-learn in this example). Quarto thus lets you pick the best of both worlds.