This Week I Learned - Week 30 2021
What is this? A semi-unstructured brain dump of things I read or learn - hopefully weekly.
AI
- Any detection method for Deepfakes can be used to make Deepfakes better. Deepfakes use GANs, with the Generator generating the image and the Discriminator checking it for plausibility. In the past, Deepfakes could be detected by eyes that do not blink or by missing/wrong light reflections in the eyes (Hu et al., 2020); once such a check is added to the discriminator, the generator will automatically learn to avoid these “mistakes”.
- There is a bug in many PyTorch codebases: when overriding `__getitem__` of a PyTorch `Dataset`, the NumPy random state is copied into every worker process, so the “random” numbers stay the same for each process and batch. To solve this, set `worker_init_fn` in the `DataLoader` to e.g. `np.random.seed(np.random.get_state()[1][0] + worker_id)`, which will initialize each data worker with its own random seed (a short sketch follows after this list).
- Bert and how transformers change(d) AI: Self-Attention layers replaced RNN architectures. RNNs are/were notoriously difficult to train, as the gradients of “older” inputs vanish (Vanishing Gradients; LSTMs can be a solution, but they are complicated).
- Humans do not read letter by letter from front to end but in chunks. Transformers read bidirectionally (the B in Bert stands for Bidirectional), although this is not technically bidirectional, as the model looks at the whole text at once.
- Bert is trained by masking (replacing words with [MASK] or a random word and letting the model predict them) and Next Sentence Prediction (NSP); see the fill-mask sketch after this list.
- Some Transformer networks like GPT-3 can be few-shot or even zero-shot learners, where only the candidate labels are needed to classify e.g. a sentence. With HuggingFace you can try this out in your browser (a zero-shot sketch is included after this list).
- Is Synthetic Data the next step in AI Development? We can create synthetic data in many different ways: based on different input resources, GANs can already generate realistic-looking images (e.g. faces by StyleGAN2 or street scenes based on GTA). We can also use 3D rendering tools like Blender (with the help of BlenderProc) to generate diverse images and train neural networks on synthetic data only. This is achieved by placing the main object at a random position and adding other random objects, random backgrounds, and varying lighting conditions. In practice I think this will mostly be used to augment small datasets with synthetic data. Why is this useful? In synthetic data, we always have ground-truth information, e.g. about location or segmentation. Real objects can also be “augmented” using a green screen and a small robot, where the robot takes images from different positions and the green screen is replaced with a real-world background.
- Alias-Free GAN looks like the next Generation of StyleGAN2. It can generate consistent images across different transformations of the latent space.
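A minimal sketch of the DataLoader seeding fix mentioned above; the toy dataset, its size, and the “augmentation” noise are made up for illustration:

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader

class RandomAugmentDataset(Dataset):
    """Toy dataset whose __getitem__ draws NumPy random numbers (e.g. for augmentation)."""
    def __init__(self, n=8):
        self.data = np.arange(n, dtype=np.float32)

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        noise = np.random.rand()  # without a per-worker seed, identical across workers
        return self.data[idx] + noise

def worker_init_fn(worker_id):
    # Re-seed NumPy in each worker so every worker draws different random numbers.
    np.random.seed(np.random.get_state()[1][0] + worker_id)

loader = DataLoader(RandomAugmentDataset(), batch_size=2, num_workers=2,
                    worker_init_fn=worker_init_fn)

for batch in loader:
    print(batch)
```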
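To get a feel for the masking objective, here is a small fill-mask sketch using the HuggingFace transformers library; the model choice and example sentence are just illustrative:

```python
from transformers import pipeline

# BERT was pre-trained to predict masked tokens; the fill-mask pipeline exposes exactly that.
unmasker = pipeline("fill-mask", model="bert-base-uncased")

for prediction in unmasker("Transformers are trained by [MASK] random words in a sentence."):
    print(prediction["token_str"], round(prediction["score"], 3))
```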
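And a zero-shot classification sketch, where only the candidate labels are provided at prediction time; the text and labels are invented for the example:

```python
from transformers import pipeline

# No task-specific training: the model scores the text against arbitrary candidate labels.
classifier = pipeline("zero-shot-classification")

result = classifier(
    "The stock market dropped sharply after the earnings report.",
    candidate_labels=["finance", "sports", "politics"],
)
print(result["labels"][0], result["scores"][0])  # most likely label and its score
```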
Software Engineering
- A plea for a better understanding of testing, beyond automatic testing in Test Driven Development.
- Documentation is notoriously overlooked in CI/CD. To reduce developers’ cognitive load and prevent a general knowledge leak, think of writing documentation as Continuous Documentation. It cannot be an open-ended effort; it needs feedback and measurable outcomes. Tools like swimm.io can help with providing a closed loop that can be used in most agile environments.
- The SQLAlchemy `association_proxy` is a nifty idea to make database code less verbose, more transparent, and standardized (see the sketch below).
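A minimal sketch of what `association_proxy` does; the User/Keyword models, table names, and attribute names are made-up assumptions for the example:

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.ext.associationproxy import association_proxy
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

class Keyword(Base):
    __tablename__ = "keyword"
    id = Column(Integer, primary_key=True)
    word = Column(String, nullable=False)
    user_id = Column(Integer, ForeignKey("user.id"))

    def __init__(self, word):
        self.word = word

class User(Base):
    __tablename__ = "user"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    kw = relationship(Keyword)
    # Expose plain strings instead of Keyword objects.
    keywords = association_proxy("kw", "word")

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    user = User(name="ada")
    user.keywords.append("sqlalchemy")  # creates a Keyword("sqlalchemy") behind the scenes
    session.add(user)
    session.commit()
    print(list(user.keywords))  # ['sqlalchemy']
```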
Education
- I wrote about why we should think about more guidance in education
Society
- Podcast: 80,000 Hours with Cal Newport. Offices today are a hyperactive hivemind, where we figure things out on the fly with unscheduled digital messages back and forth. For small groups, e.g. 3 people, this can work, but it does not scale. Instead, what happens is that the actual work one is doing gets interrupted permanently. We know that context/task switching harms performance, as each switch comes with a cost. But at the same time, knowledge workers frequently check their inboxes - some every 5-7 minutes. Instead of interrupting blocks of parallel work, we want to handle tasks sequentially with minimal/zero interference. This does not mean everyone needs prolonged hours of strict concentration. Agile methodologies, e.g. based on Sprints or Extreme Programming, reduce communication to structured, time-boxed meetings, clearly and publicly tracking the work of each team member. The change process towards less context switching can be very difficult, as we have a culture of autonomy where it is up to each worker to figure out how they do their work, but we should empower ourselves to work in a way that uses our brainpower best. This can be achieved once we embrace neuroscience and psychology as guiding principles for changing how we work. If we please our brain, we please ourselves.
Bibliography
Shu Hu, Yuezun Li, and Siwei Lyu. Exposing GAN-generated faces using inconsistent corneal specular highlights. arXiv preprint arXiv:2009.11924, 2020.