This Week I Learned - Week 40 2021


Fundamental to learning and spaced repetition is the forgetting curve, which states that memory decays exponentially. Online language learning tools like Duolingo have large amounts of learner data, which can be used to optimize their spaced repetition algorithms and, at the same time, to learn about learning itself. First, let’s take a quick look at the maths. The forgetting curve is given by

p = 2^{-\Delta / h}

with p the probability of correctly recalling an item, \Delta the lag time since the last practice, and h the half-life, measuring the strength of the item in the learner’s long-term memory. The left side of the figure shows the memory model assuming no repetition: immediately after learning (\Delta = 0), the probability of a successful recall is 100%. The right side shows actual learner behavior with multiple repetitions (graphic by Settles and Meeder (2016)).
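The curve can be evaluated directly; here is a minimal sketch in Python (the half-life of 7 days is chosen purely for illustration):

```python
def recall_probability(lag_days: float, half_life_days: float) -> float:
    """Forgetting curve: probability of recalling an item after
    `lag_days` without practice, given its memory half-life."""
    return 2 ** (-lag_days / half_life_days)

# Immediately after practice, recall is certain; it halves every half-life.
print(recall_probability(0, 7))   # 1.0
print(recall_probability(7, 7))   # 0.5
print(recall_probability(14, 7))  # 0.25
```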

In practice, we observe the recall outcomes and lag times but not the half-life h of an item in a learner’s memory. We therefore estimate the half-life from multiple features \mathbf{x} (e.g. the number of correctly recalled items) as \hat{h}_{\Theta} = 2^{\Theta \cdot \mathbf{x}}. The weight vector \Theta can either be taken from the literature (e.g. for Leitner (1991), \Theta = \{x_{\oplus}: 1, x_{\ominus}: -1\}, with x_{\oplus} the number of past correct responses and x_{\ominus} the number of incorrect ones) or estimated empirically from historical learning data. Settles and Meeder (2016) present half-life regression (HLR), a new spaced repetition algorithm that improves the prediction of learners’ recall.
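A minimal sketch of this half-life estimate with the Leitner weights: only the formula \hat{h}_{\Theta} = 2^{\Theta \cdot \mathbf{x}} and the +1/-1 weights come from the sources above; the feature names and example counts are my own illustration.

```python
def estimated_half_life(theta: dict, features: dict) -> float:
    """Half-life estimate h_hat = 2^(theta . x) as used in HLR."""
    dot = sum(theta[name] * features[name] for name in theta)
    return 2 ** dot

# Leitner-style weights: +1 per past correct response, -1 per incorrect one.
theta_leitner = {"n_correct": 1.0, "n_incorrect": -1.0}
features = {"n_correct": 4, "n_incorrect": 1}  # learner's history so far

h = estimated_half_life(theta_leitner, features)  # 2^(4-1) = 8 days
p = 2 ** (-4.0 / h)  # predicted recall probability after a 4-day lag
```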


  • I think we have covered side-channel surveillance and attacks on this blog before, where air-gapped computers leak information through changes in sound, status LEDs, or other visible signals. In new surveillance research from MIT, a neural network predicts the number and activity of people in a room just by looking at shadows on blank walls that are imperceptible to the human eye. (Sharma et al., 2021)


  • I think a combination of Papers with Code and Hugging Face will shape how we think about AI research in the future. The paper by Kardas et al. (2020) on automatically extracting leaderboards from ML papers is worth a read. It combines table type classification, table segmentation, and the aggregation of results into leaderboards in an end-to-end pipeline, which is currently live on the Papers with Code website. Especially notable is their approach to capturing the context of tables, as the relevant data is often spread across different sections, e.g. the dataset description in the running text, the model description in the table, and parameters described in the table caption.

  • Beethoven’s unfinished 10th Symphony has been completed with the help of machine learning to form a coherent musical piece. Few details on the technology used have been published so far, but if I had to guess, I would say they used GANs together with a lot of human labeling and validation.

  • Wood et al. (2021) at Microsoft trained computer vision models for face-related tasks with only synthetic data. Using rendered data allows them to produce extremely detailed labels, like 3D maps and pixel-perfect segmentation masks. It also enables them to tackle novel use cases, ensure high diversity in the dataset to reduce bias, and render more of the specific faces that are error-prone. A note on the scale of this operation: rendering 100,000 images at 512x512 resolution took 48 hours on 150 NVIDIA M60 GPUs, which means 3,000 kWh of electricity, or $7.2k, to render the images. Example faces, which are all rendered.
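A quick back-of-the-envelope check of those rendering numbers; note that the ~$1 per GPU-hour rate is my own assumption, not a figure from the paper.

```python
# Rendering scale quoted above: 100,000 images, 48 h on 150 NVIDIA M60 GPUs.
gpus = 150
hours = 48
images = 100_000

gpu_hours = gpus * hours                          # 7,200 GPU-hours total
cost = gpu_hours * 1.0                            # ≈ $7,200 at an assumed $1/GPU-hour
seconds_per_image = gpu_hours * 3600 / images     # ≈ 259 GPU-seconds per image
```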


Marcin Kardas, Piotr Czapla, Pontus Stenetorp, Sebastian Ruder, Sebastian Riedel, Ross Taylor, and Robert Stojnic. AxCell: automatic extraction of results from machine learning papers. arXiv preprint arXiv:2004.14356, 2020.

Sebastian Leitner. So lernt man lernen:[angewandte Lernpsychologie-ein Weg zum Erfolg]. Herder, 1991.

Burr Settles and Brendan Meeder. A trainable spaced repetition model for language learning. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 1848–1858. 2016.

Prafull Sharma, Miika Aittala, Yoav Y. Schechner, Antonio Torralba, Gregory W. Wornell, William T. Freeman, and Fredo Durand. What you can learn by staring at a blank wall. 2021. arXiv:2108.13027.

Erroll Wood, Tadas Baltrusaitis, Charlie Hewitt, Sebastian Dziadzio, Thomas J Cashman, and Jamie Shotton. Fake it till you make it: face analysis in the wild using synthetic data alone. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 3681–3691. 2021.