<html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"></head><body style="overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;"><div><span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">This is an announcement of Gil Rosenthal's MS Presentation</span></div><div><br></div><div>Gil Rosenthal is a student in the Bx/MS program.</div><div><br></div><div>———————————————————————————————————————————</div><div><br></div><div><b>Date:</b> Wednesday, May 17, 2023</div><div><br></div><div><b>Time:</b> 3 PM, CST</div><div><br></div><div><b>Location:</b> JCL 298</div><div><b><br></b></div><div><b>Zoom: </b><a href="https://urldefense.com/v3/__https://uchicago.zoom.us/j/91727524850?pwd=d3ZTZjdURGpVMXhsWk9BM2FZVWVKQT09__;!!BpyFHLRN4TMTrA!_VsA5enRfQfwQHgtyYAffUj4Q0u07dTGTa0kqMp17kXz2eK90oSr4GAyuO-BxeKnMJDEf5D3S7CprOuvmrPQY-4BU4HX$">https://uchicago.zoom.us/j/91727524850?pwd=d3ZTZjdURGpVMXhsWk9BM2FZVWVKQT09</a></div><div><br></div><div><b>M.S. Candidate:</b> Gil Rosenthal</div><div><br></div><div><b>M.S. Paper Title:</b> Machina Cognoscens: Neural Machine Translation for Latin, a Case-Marked Free-Order Language</div><div><br></div><div><div style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);"><b>Advisor:</b> Allyson Ettinger</div><div style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);"><br></div><div style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);"><b>Committee Members:</b> Allyson Ettinger, Jeff Tharsen, and Chenhao Tan</div></div><div><br></div><div>———————————————————————————————————————————</div><div><br></div><div><b>Abstract:</b></div><div><br></div><div>Neural methods have revolutionized automated machine translation: most widely spoken languages now have robust training datasets and near-human translation performance. However, these methods have not had the same effect on case-marked free-order languages. A free-order language is one that has no fixed word order, i.e.,
the subject, verb, and object can appear anywhere in the sentence without violating the rules of the grammar. Case-marked means that additional information about a word, such as its number and grammatical function, is encoded in morphological features of the word, such as case or conjugation. As a target language, we use Latin, a free-order, case-marked (FOCM) language for which existing machine translation tools are extremely poor. We have created a first-of-its-kind parallel translation dataset of roughly 100k pairs and evaluated it on neural machine translation, using novel preprocessing methods to encode morphology and new approaches to transfer learning. Our best model achieves a BLEU score of 22.4 on the test dataset, which beats the current state-of-the-art Google Translate model by over 4.2 BLEU.</div><div><br></div><div>———————————————————————————————————————————</div><div><br></div><br><div></div></body></html>