My work schedule and my inability to sign up for some sessions prevented me from attending most NYCDH Week events. However, I tried to make the most of it 🙂
I signed up for the “Introduction to Arabic Text Processing with Python and CAMeL Tools” workshop. Although I was not able to attend due to work, I still received the slides and some useful links from Salam Khalifa and Ossama Obeid, who led the workshop. The topic was of special interest to me as a native Arabic speaker. I have worked with Arabic text parsing and processing and found it challenging, because the Arabic alphabet (Abajadia) and its orthographic ambiguity make existing tools hard to adapt. I learnt a lot by looking at the slides and the Python Colab notebook used in the session. CAMeL Tools, built by the Computational Approaches to Modeling Language (CAMeL) Lab at NYU Abu Dhabi, is an open-source Python toolkit for Arabic Natural Language Processing.
CAMeL Tools supports different forms of Arabic text processing:
- Preprocessing (cleaning and preparing the text for analysis): an example of this is dediacritization. Diacritical marks are part of the Arabic language and are placed on letters to specify their pronunciation. Since it is almost impossible for any NLP tool to analyse diacritical marks, the tool provides a function to remove them from the text.
- Morphological Analysis: an example of this is generation, which is the process of “inflecting a lemma for a set of morphological features”. For example, it can generate all the possible inflections of a word that are masculine plural or feminine dual, depending on the features the developer specifies.
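To make the dediacritization step described above more concrete, here is a minimal sketch in plain Python. It is not CAMeL Tools' own implementation: the toolkit ships its own function for this (`dediac_ar` in `camel_tools.utils.dediac`, according to its documentation). The helper name `strip_diacritics` and the exact set of marks I remove are my own assumptions, and the library's character set may differ.

```python
import re

# Assumption: treat the marks from fathatan to sukun (U+064B-U+0652)
# plus the superscript alef (U+0670) as the diacritics to remove.
# The character set used by CAMeL Tools itself may be different.
_DIACRITICS = re.compile("[\u064B-\u0652\u0670]")


def strip_diacritics(text: str) -> str:
    """Return `text` with the Arabic diacritical marks above removed."""
    return _DIACRITICS.sub("", text)


# "kataba" (he wrote): kaf, ta, ba, each followed by a fatha (U+064E).
vocalized = "\u0643\u064E\u062A\u064E\u0628\u064E"
bare = strip_diacritics(vocalized)

# Three letters remain once the three fatha marks are stripped.
print(len(vocalized), len(bare))
```

Text that has no diacritics (or is not Arabic at all) passes through unchanged, so a helper like this can be applied to a whole corpus before further analysis.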
The tool also provides a sentiment analysis system and a few other NLP features. I specifically found the dialect identification feature very cool. The tool is able to classify a certain text’s dialect by city, country and region.
I will definitely be using CAMeL for a future Arabic text project!
For CAMeL documentation: https://camel-tools.readthedocs.io/en/latest/overview.html
I also signed up for the kickoff event but was only able to attend the networking session with Lisa Rhody and Madiha Choksy. As soon as I joined the Zoom call, one of the participants had just finished sharing the amazing work they were doing on language labs, and the group decided to go down the list of participants to see if anyone wanted to share anything. Ironically, my name was the first to be called. I thought it was a good idea to share one of the projects I worked on during Interactive Data Visualization. It was a way for me to improvise a presentation and get some feedback on my work. The project, which you can take a look at here (https://nedjaem.github.io/med_migration_scrolly/), is still a work in progress. The visualization, entitled “Migrants and State Violence in the Mediterranean”, aims to highlight migration events between 2014 and 2019, with a focus on the involvement of nation states in the crisis. I went over the project rationale and some of the decisions I made during its development. Overall, I was glad to have the opportunity to spontaneously present my work at the event.


