Author Archives: Eva Sibinga

Revised Project Proposal: Freedom of Speech*

Team Members and roles

Eva: Developer + research/text analysis
Joanne: Developer + research/text analysis
Kevin: UX/UI lead + research/text analysis
Martin: Project Manager + research/text analysis
Outreach: we will assign this task time and space, rather than assigning it to a specific person (i.e. we plan to devote collective and individual time to outreach on a weekly/biweekly basis, since there isn’t one obvious person to take the lead on it)

Abstract

Freedom of Speech* is a web project that helps users to understand the First Amendment right to freedom of speech through interactive visualizations of Supreme Court verdicts that have expanded or contracted the definition of “free speech” over the history of the United States. The project aims to dispel misconceptions about the First Amendment (specifically its blanket protection of freedom of speech) as static and limitless, and to illustrate how historical circumstances, diplomatic relationships, or the realities of race, class, religion, and other aspects affect whether and how free speech has been protected by the U.S. government.

We’ll focus on combining a clean user interface with an entry-level approach to the gargantuan field of Supreme Court precedent on First Amendment freedom of speech, in order to capture a user’s attention for educational purposes. With those two foundations (clean UI and digestible content) in mind, this web project aims to offer a humanistic inquiry into the sticky relationship between the letter of the law and the cases that define its actual implementation, through a robust, beautiful, functional, data-driven web app. Its focus on improving critical thinking skills and fostering a better-informed civic populace around a topic that is today largely synonymous with social media makes it poised to be an effective Digital Humanities tool.

As the U.S. comes to terms with what free speech means in the internet age, a baseline literacy and understanding of the concept becomes increasingly important. That leaves the populace with a question that this project will be built to answer: What does ‘freedom of speech’ really mean?

Very brief environmental scan

Legal language is largely inaccessible to those without an educational or institutional background in law. In addition, Supreme Court verdicts are lengthy and tedious to read and understand. As such, most of the general public is unfamiliar with how the law works, and does not realize that legislation and the Constitution are really just the beginning in determining what is considered “legal.” As it pertains to the freedom of speech, the First Amendment may be understood as a determining, theoretical base, but individual cases and case law determine how the freedom of speech is practiced. Not only that, but most people do not even understand what the First Amendment is. Thus, Freedom of Speech* seeks to make case law (particularly Supreme Court verdicts) more accessible and easier to understand, so as to better explain to users how particular issues (such as partisanship) relate to the freedom of speech.

While there are plenty of papers and case studies that have similarly tackled the lack of understanding surrounding the First Amendment, there are currently few data-oriented approaches. One similar project is the Supreme Court Database, which publishes data about every case that has been on the Supreme Court’s docket and has an analysis tool that allows users to select cases from a range of years and obtain a set of horizontal bar charts showing the frequency of cases matching the parameters. That project has an extremely broad scope and aims to make case finding and analysis easier for legal scholars. Freedom of Speech* instead focuses on the narrow topic of First Amendment freedom of speech cases and aims to make SCOTUS-level case law more accessible to a general population. Additionally, while the Supreme Court Database’s analysis tool provides only simple bar charts of the frequency of a certain subset of SCOTUS cases, our project seeks to create more complex data visualizations that illustrate the spatial dimensions of Supreme Court verdicts: that is, visualizations of how the complex web of case law, courts, judges, appointing politicians, and political parties takes part in determining how the freedom of speech is interpreted and understood.

There is a notable gap at the intersection of innovative data visualization and legal data research: a site like Oyez offers information about the Supreme Court, but not much in the way of visuals. We seek to fill that lacuna with this project. By creating a tool that eases the point of entry into the sphere of legal rights using clean web design and incisive visuals, we hope to open a door for others to create works that go beyond our scope and make the entirety of the legal field accessible to all.

What technologies will be used?

Most of our data work can be done with Python. We have already begun using BeautifulSoup to pull Supreme Court content from the Justia website and prepare it for text analysis. We may explore topic modeling or other forms of text analysis for analyzing and organizing the text data, but that determination depends to some extent on the cleanliness and existing categorizations of the text data we get from Justia, and on our ability to manually encode and clean it. We’ll likely use Voyant as a starting place that’s accessible to all members of our group, and may use Python or R for further text analysis if it seems interesting and worth the time and effort.
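To make the "pull content and prepare it for text analysis" step concrete, here is a minimal sketch of what that preparation could look like. It is not our actual scraper: it uses Python's standard-library html.parser as a stand-in for BeautifulSoup so that it runs without any installs, and the sample HTML, the stopword list, and the names TextExtractor, html_to_text, and term_counts are all hypothetical placeholders rather than anything taken from Justia.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html):
    """Return the visible text of an HTML page as one space-joined string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# A tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = {"the", "of", "a", "an", "and", "to", "in", "at", "was", "is"}


def term_counts(text):
    """Lowercase, tokenize on letters/apostrophes, and count non-stopwords."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


# Hypothetical placeholder page, not real Justia markup or a real case.
SAMPLE_HTML = """
<html><head><style>p { color: black; }</style></head>
<body>
  <h1>Placeholder v. Example</h1>
  <p>The speech at issue was protected speech.</p>
  <script>console.log("tracking");</script>
</body></html>
"""

text = html_to_text(SAMPLE_HTML)
counts = term_counts(text)
print(text)
print(counts.most_common(3))
```

Simple term counts like these are roughly what Voyant gives us out of the box; the point of the sketch is only to show how much cleaning (dropping scripts, styles, and stopwords) has to happen before any counting or topic modeling is meaningful.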

Web development will be done in HTML, CSS, and JavaScript. Any data visualizations will likely be done in d3.js and vanilla JavaScript. We may use Vue as a front-end framework.

Individually, team members will likely use Observable notebooks and Jupyter notebooks for process-based coding and prototyping, as well as Figma for web design and UX prototyping. We’re using several collaborative platforms: GitHub for coding and data management; Trello for organization and project management; and Discord for communication.

While each of the technologies listed is familiar to at least one member of the group, few of them are familiar to all of us. We’ve established standing invitations to listen in/watch screen shares whenever it’s interesting or helpful, in service of learning from each other’s knowledge bases. If we need help outside of troubleshooting or what’s available through StackOverflow, we’ll seek outside help from Micki, Javier, or other founts of techno-knowledge.

How will the project be managed?

We will be using Discord as the main channel for chatting, voice communication, and screen sharing. Google Docs will be used for shared documentation, and Google Sheets for simple data sets. We will be creating a GitHub Organization in order to have team-based tools as well as a shared repository for this project, which will host not only our code but also our data and likely our final webpage. We have already created a shared GitHub repository in order to share access to large data sets. In lieu of a traditional calendar, a dashboard on Trello will track all of our milestones, due dates, and shared meeting and class notes.

Milestones

Our broad overview of work is as follows: first, we will be cleaning and processing the data, scraping any further information we need and manually categorizing and preparing any content we need to. We will also be tagging text and topic modeling to make the data more robust. We envision this taking 2-3 weeks. During this time, we’ll be brainstorming outreach ideas. Afterwards, we will be drafting and revising our UX/UI vision, creating iterations in order to get to a design that meets our project’s goals. At this stage, we will also be prototyping visualizations in order to get a sense of how our data can best be communicated. This process should take 1-2 weeks. Once we have some of this UX/UI vision in place, including logo design and branding, we can also begin to do actual outreach that communicates our goals to potential partners and interested parties.

The next step would be wireframing the website, combining prototypes with the structure of the website to get a mockup of what will be the final product. This process should also take 1-2 weeks. After this, we will begin constructing the code scaffolding according to the wireframes and prototypes, and once the scaffolding is in place, we will be actively working on code in order to build out our vision. This stage will likely take up the rest of the time (3-4 weeks) as we refine our code and test for bugs. This time will also include the final outreach pushes as we find ways to share our project with the world and publicize its CUNY launch. Our final product will be an accessible, friendly, non-patronizing website that encourages critical thought while remaining mindful of cognitive overload, and will be deployed on GitHub Pages.
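As a quick sanity check on the timeline, the stage estimates above can be added up. The sketch below assumes the four stages run strictly back to back with no overlap (outreach runs alongside them and is not counted); the stage labels are our own shorthand and the function names are purely illustrative.

```python
# Stage durations in weeks as (label, minimum, maximum), from the milestones above.
STAGES = [
    ("Data cleaning, tagging, topic modeling", 2, 3),
    ("UX/UI drafts and visualization prototypes", 1, 2),
    ("Wireframing and mockup", 1, 2),
    ("Code scaffolding, build-out, testing", 3, 4),
]


def total_range(stages):
    """Return (shortest, longest) total duration in weeks."""
    shortest = sum(low for _, low, _ in stages)
    longest = sum(high for _, _, high in stages)
    return shortest, longest


def schedule(stages):
    """Return (label, earliest start week, latest end week) per stage.

    Assumes each stage starts only when the previous one has finished.
    """
    rows = []
    earliest_start = 0
    latest_end = 0
    for label, low, high in stages:
        latest_end += high
        rows.append((label, earliest_start, latest_end))
        earliest_start += low
    return rows


shortest, longest = total_range(STAGES)
print(f"Total: {shortest}-{longest} weeks")
for label, start, end in schedule(STAGES):
    print(f"{label}: starts week {start} at the earliest, ends by week {end} at the latest")
```

Under that assumption the plan spans 7 to 11 weeks in total, which is the number to compare against the weeks actually left in the semester.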

Collecting Twitter Data for Research // omg I love TAGS

I attended Francesca Giannetti’s Collecting Twitter Data for Research workshop on Wednesday of NYCDH week. I really enjoyed it, and more to the point, can envision myself potentially using Twitter data as part of my capstone project or further research. I particularly appreciated that Francesca began with an overview of Twitter to remind us all that Twitter data on a hashtag does not mean “what people think about this topic,” but rather can give us an idea of what (generally) college-educated, middle/upper class people in some countries (the U.S., Japan, India as the top three) have Tweeted about the topic. She did point out that for U.S. data, Twitter is particularly interesting because the user base has roughly equal percentages of Black, white, and Latinx users. Beyond being interesting background, it was a good reminder of responsible data collection practices to start the workshop off with some guidelines on who/what Twitter data reflects.

I was struck (again) by the breadth of knowledge that can be expected in “beginner” workshops. This one was probably only “beginner” for people who are already well-versed in troubleshooting tech problems, but I was pleasantly surprised to find that I was somewhat familiar with the tech bill already and was able to keep up with all of it. A year ago I probably would have only been able to take in about half the workshop, and it’s always good to have reminders of my tech progress, which often feels slow in the day-to-day.

The workshop focused on two tools for Twitter data collection: TAGS (Twitter Aggregating Google Sheet) and rtweet, a Twitter package for R. I’m cagey about fawning over Google products, but… a custom Google Sheet made with Google Apps Script can be truly awesome. It’s like an Excel spreadsheet that invites developers to jury-rig it and make it interactive and totally customized to a specific purpose using pseudo-JavaScript. The end result is a custom Google Sheet that can do fun/amazing code things without requiring the user to know any code! The coolest part (IMO) is that the developer can basically create new buttons or dropdowns in the Sheets menu. In the case of TAGS, there are buttons to run scripts that connect the sheet to your Twitter account, connect to the Twitter API, and run the collection script on whatever hashtag(s) and other parameters the user sets in the sheet. (Digression: If anyone’s interested, I have a great geocoding Google script that allows you to put hundreds of addresses into a Google Sheet and click a “Geocode” button to convert them all to lat-long.)

The major limitation of TAGS is that it only collects data from the last 6-9 days unless you pay for premium access to the Twitter API. That being said, you can apply for developer access to the Twitter API as a researcher, which ostensibly gives you premium access and the ability to collect historical/older Tweets. (But, that being said… I applied and outlined my intended research use and was asked for more information and haven’t gotten access yet. We’ll see how it goes.)

The rtweet package is a little more intense on the coding side, but it allows you to collect historical Tweet data for free. I spent most of my workshop energy on TAGS, but am looking forward to spending some more time with rtweet to get that up and running. Francesca’s code includes excellent instructions, so I feel good about my ability to troubleshoot that in the next week or so. Overall, for me it was definitely a “teach a woman to fish” kind of workshop! If anyone is interested in the workshop materials or learning to collect Twitter data using TAGS or rtweet, I’d be happy to share the links Francesca gave and give whatever info I can.

Eva’s skillset

Hi! I’m Eva, a (nascent) web developer with an English (20th C African American lit focus) background. Here are my skills:

Project management: This is not my preference — I’m organized and generally good at sticking to deadlines (she said, on a post a day late), but I work better as a collaborator when I’m not project managing.

Development: My last three semesters as a full-time DAV student have definitely kickstarted my data-driven development skills. I’m comfortable with web development (HTML, CSS, JavaScript), collaboration with GitHub, and data viz tools like Tableau, d3.js, Mapbox, and Arc or QGIS. I’d consider this a definite strength of mine at this point.

Design/UX: I have a decent baseline understanding here, but would absolutely love to support someone else who has knowledge/expertise with wireframing and other design skills. I’m trying to improve my skills in translating a design vision to actual shapes/colors/lines/fonts/icons etc, and would jump at the opportunity to support and learn from someone who knows more.

Outreach/social media: This is not so much my thing, but I have some experience leveraging my network to get word out about projects, and am happy to send emails/make calls under someone else’s direction.

Documentation: I’m a strong writer, and a big goal of mine is documenting projects in ways that make them accessible and shareable. I want to make high quality academic work that is relevant and accessible to the public.

Research: I’m strong here — I have a liberal arts background and was an English major with a focus on critical race theory and DH. I’ve done a few projects with heavy literary DH research and writing in the last couple years, and my research interests include data ethics and the intersections of race, labor, and algorithmic control.

DHUM 70002 Digital Humanities: Methods and Practices (Spring 2021)
