Christopher Brix

2021-2025
Ph.D. Computer Science, RWTH Aachen University, Germany

For my Ph.D. thesis, I will investigate approaches that allow to generate mathematical proofs for properties of neural network, such as the robustness against adversarial attacks.
2018-2020
M.Sc. Computer Science, RWTH Aachen University, Germany

I've continued to work as a student research assistant, co-authoring two publications. In my master thesis, I proposed and evaluated a new technique to prove the non-existance of adversarial examples in neural networks, using an improved version of symbolic propagation. It was graded 1.0. Final overall grade: 1.5
2014-2018
B.Sc. Computer Science, RWTH Aachen University, Germany

Early on, I decided to specialize in machine learning, and attended several lectures and seminars on this topic. I also started to work as a student research assistant at our local chair for natural language processing. My bachelor thesis "Extension of the Attention Mechanism in Neural Machine Translation" dealt with the application of two-dimensional LSTM cells that could replace the attention mechanism in neural machine translation. It was graded 1.2. Final overall grade: 1.6
2006-2014
Konrad-Heresbach-Gymnasium Mettmann, Germany

In high school, I've focussed on math, chemistry and computer science. As an additional project, I've teamed up with 2 friends and programmed an online browser-based economics simulation. In my free time, I became a tutor and mediator for younger students. Final grade: 1.0

Two-Way Neural Machine Translation: A Proof of Concept for Bidirectional Translation Modeling using a Two-Dimensional Grid

P. Bahar, C. Brix, H. Ney. IEEE Spoken Language Technology Workshop 2021

Based upon our previous work on two-dimensional LSTMs in machine translation, we investigate a new architecture that allows to translate in both directions for a given language pair. As opposed to other multilingual setups, the translation direction is defined by the network architecture itself, instead of relying on additional tokens that indicate the translation direction.

Read the paper

Debona: Decoupled Boundary Network Analysis for Tighter Bounds and Faster Adversarial Robustness Proofs

C. Brix, T. Noll. Unpublished

The summarised results of my master thesis, which provided an improvement for the computation of mathematical guarantees about the correctness of neural networks in the face of adversarial manipulation of the input. We provide a detailed comparison between our baseline "Neurify" and the new open source software toolkit "Debona" (total runtime reduction: 94%).

Read the paper

Proving Non-Existence of Imperceptible Adversarial Examples in Deep Neural Networks Using Symbolic Propagation With Error Bounds

C. Brix (2020). Master thesis. RWTH Aachen University, Germany

Because neural networks are often used in safety-critical real-world applications, it is important to guarantee the correctness of their computations. Unfortunately, neural networks commonly fail for "adversarial examples", where a correctly classified input is misclassified once minor modifications are applied. For my master thesis, I proposed, implemented, and evaluated an improved technique for computing mathematical proofs about the amount of modifications that could be applied before a misclassification occurs. Compared to "Neurify", my updated open source toolkit "Debona" decreases the necessary runtime by up to 94%, which allows the analysis of larger and more complex networks.

Read the thesis

Successfully Applying the Stabilized Lottery Ticket Hypothesis to the Transformer Architecture

C. Brix, P. Bahar, H. Ney. To be published in the proceedings of the 2020 Annual Conference of the Association for Computational Linguistics, acceptance rate 25.2%

The stabilized lottery ticket hypothesis states that the (almost) untrained network already contains sparse sub-networks that can be trained to reach the same accuracy as the dense model. We test this hypothesis on the WMT 2014 En->De and En->Fr task using the transformer architecture, and compare it to other pruning techniques. Furthermore, we present a new combination of pruning techniquies that generates better performance than any individual technique.

Read the paper

Towards Two-Dimensional Sequence to Sequence Model in Neural Machine Translation

P. Bahar, C. Brix, H. Ney. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 3009-3015. Association for Computational Linguistics, acceptance rate 23.2%

We describe a novel network architecture: A two-dimensional LSTM can process two separate sequences at the same time. By feeding in the source sentence, as well as the partial target hypothesis, the 2D-LSTM can internally attend to the relevant source positions. Therefore, we are able to use the final 2D-LSTM output directly to predict the next target token, and do not need an explicit decoder.

Read the paper

Extension of the Attention Mechanism in Neural Machine Translation

C. Brix (2018). Bachelor thesis. RWTH Aachen University, Germany

For my bachelor thesis, I've analyzed several alternatives to the attention mechanism which usually determines which source positions to attend to. Besides a technique to re-encode the source sentence dependend on the partial target hypothesis, I've experimented with the usage of one- and two-dimensional LSTMs that summarize the source sentence. I also report first results for a network which solely consist of a two-dimensional LSTM and no additional decoder structure. This work later resulted in the EMNLP paper. If you are interested in the gradient calculations for multi-dimensional LSTMs, have a look!

Read the thesis

Empirical Investigation of Optimization Algorithms in Neural Machine Translation

P. Bahar, T. Alkhouli, J.-T. Peter, C. Brix, and H. Ney. The Prague Bulletin of Mathematical Linguistics, 108(1):13–25, 2017

In order to optimize neural networks, there are a lot of different optimization algorithms. In this paper, we compare and evaluate them with respect to convergence speed, translation quality, and training stability. We also report results on combinations of multiple algorithms.

Read the paper

Sep 2020 -
Jan 2021
Research SWE Intern, Google Zurich (remotely)

During my internship, I investigated the application of a fine-tuned BERT model for named entity recognition. To this end, I analyzed multiple datasets, wrote an augmentation script and evaluated a variety of different training techniques. Furthermore, I set up a CI pipeline on GitHub to ensure a high code quality. As a side project, I managed a program that connected interns with each other for 1:1 chats, setting up approximately 700 meetings.
Nov 2016 -
Feb 2020
Student Research Assistant, RWTH Aachen University

At the Human Language Technology and Pattern Recognition chair (Professor Ney), I mainly focused on analyzing alternatives for the attention mechanism in NMT. I investigated the usage of two-dimensional LSTMs for machine translation tasks, implementing the corresponding code in C/CUDA and integrating it into the inhouse machine learning framework (RETURNN). Furthermore, I independently worked on a paper about comparisons and improvements of sparsity in complex network architectures.
Apr 2016 -
Sep 2016
Tutor for the lecture "Datastructures and Algorithms", RWTH Aachen University

In weekly meetings, I explained the content of the lecture to a group of about 20 students. I was responsible for grading their homework and answering any questions. Topics included runtime analysis, formal proof of correctness, and sorting algorithms.

2021
Invited Talk at Google Munich

One of the teams working on privacy in machine learning invited me to talk about the current state of the art in network verification, as well as results from my master thesis.
2020
ICT Young Researcher Award

For my research work, especially on novel network architectures in machine translation, I received the ICT Young Researcher Award from the Information and Communication Technology Committee of the RWTH Aachen University. The price is endowed with 1.500€.
2020
Invited Talk at Fraunhofer IAIS

The Fraunhofer Institute for Intelligent Analysis and Information Systems invited me to give a presentation. Topics included both my own research, as well as an overview over current research topics in neural machine translation. Slides: Current State of Research in NMT (German version).
2019
LxMLS, Monitor

I helped to organize the Lisbon Machine Learning Summer School. During daily practical lab sessions, I supervised and taught participating students, to help them understand the concepts and techniques from the lectures.
2019
Google NLP Summit

As one of a limited number of master students, I was accepted to the 3rd NLP Summit organized by Google. In Zurich, we attended talks of Google researchers about cutting edge NLP technology. We discussed current open problems and presented our own research ideas. Total number of accepted attendees: 90.
2018
LxMLS, Participant

I participated in the Lisbon Machine Learning Summer School. We learned about classification, structured prediction of sequences, trees and graphs, parsing, and deep learning. Theoretical lectures were combined with practical lab sessions and talks from high-profile professionals. Acceptance rate < 40%.
2016, 2018
Supported by the scholarship "Deutschlandstipendium"
2016, 2017
Member of the Dean's List

brix@cs.rwth-aachen.de

Christopher Brix
Martin-Luther-Str. 14
52062 Aachen
Germany

Christopher Brix

Latest publications:

Education.

Ph.D. Computer Science, RWTH Aachen University, Germany

M.Sc. Computer Science, RWTH Aachen University, Germany

B.Sc. Computer Science, RWTH Aachen University, Germany

Konrad-Heresbach-Gymnasium Mettmann, Germany

Publications.

Two-Way Neural Machine Translation: A Proof of Concept for Bidirectional Translation Modeling using a Two-Dimensional Grid

Debona: Decoupled Boundary Network Analysis for Tighter Bounds and Faster Adversarial Robustness Proofs

Proving Non-Existence of Imperceptible Adversarial Examples in Deep Neural Networks Using Symbolic Propagation With Error Bounds

Successfully Applying the Stabilized Lottery Ticket Hypothesis to the Transformer Architecture

Towards Two-Dimensional Sequence to Sequence Model in Neural Machine Translation

Extension of the Attention Mechanism in Neural Machine Translation

Empirical Investigation of Optimization Algorithms in Neural Machine Translation

Employment.

Research SWE Intern, Google Zurich (remotely)

Student Research Assistant, RWTH Aachen University

Tutor for the lecture "Datastructures and Algorithms", RWTH Aachen University

Highlights.

Invited Talk at Google Munich

ICT Young Researcher Award

Invited Talk at Fraunhofer IAIS

LxMLS, Monitor

Google NLP Summit

LxMLS, Participant

Supported by the scholarship "Deutschlandstipendium"

Member of the Dean's List

Contact.