Introduction to Computer Science

Computer Science Department

Stop and Frisk – 80 course points

In this assignment, you will analyze real CSV data from the New York Police Department from 4 prominent years: 2013, 2014, 2021, and 2022. Each of these files consists of Stop and Frisk cases that occurred in the respective year. You will practice reading in data from CSV files, using arrays and ArrayLists, referencing objects, and the object-oriented programming (OOP) paradigm.

Refer to our Programming Assignments FAQ for instructions on how to install VS Code, how to use the command line, and how to submit your assignments.

Programming

The assignment has two components:

  1. Coding (77 points) submitted through Autolab.
  2. Reflection (3 points) submitted through a form.
    • Submit the reflection AFTER you have completed the coding component.
    • Be sure to sign in with your RU credentials! (netid@scarletmail.rutgers.edu)
    • You cannot resubmit reflections, but you can edit your responses before the deadline by clicking the Google Form link, signing in with your netid, and selecting “Edit your response”.

We provide a zip file in Autolab containing StopAndFrisk.java and input files.

Observe the following rules:

  1. DO NOT add any import statements
  2. DO NOT add the project statement
  3. DO NOT change the class name
  4. DO NOT change the headers of ANY of the given methods
  5. DO NOT add any new class fields
  6. DO NOT use System.exit()
  7. YOU CAN create helper methods if needed, but please ensure they are private (encapsulation!).

Overview

The goal of this assignment is to allow you to draw conclusions about social bias using real data from the New York Police Department. In 1964, New York State passed a policing law called “Stop and Frisk” that enables officers to stop and “frisk” (briefly search) a person they deem suspicious. A person who is stopped has not necessarily committed a crime; people are presumed innocent until proven guilty by the courts. However, data has shown significant discrimination, including racial and gender bias, in which individuals are more likely to be stopped.

Feel free to read more about the history of the Stop and Frisk policy in New York to better understand the social bias it indicates: https://www.nyclu.org/en/closer-look-stop-and-frisk-nyc

This assignment lets you explore four years of New York’s Stop and Frisk history in depth, so you can see the difference (if any) in Stop and Frisk logistics between 2013 and 2014 and more recent years (2021 and 2022). These years were specifically chosen to allow analysis both during and after the policy was put into place.

You will implement a total of 6 methods to analyze and broaden your understanding of this topic. 

Implementation

Overview of Files

DO NOT edit ANY class besides StopAndFrisk.java.

  • StopAndFrisk: holds all of the methods to be written, which will be tested when running the analysis of the CSV files. Fill in the empty methods with your solution, but DO NOT edit the provided methods or the signature of any method. This is the file you submit.
  • SFRecord: holds a singular Stop and Frisk record’s information. Contains getter methods to access the several parameters of a record.
  • SFYear: represents one particular year, which contains an ArrayList of all the Stop and Frisk records for that year. Also contains getter methods to retrieve the year number you are analyzing, and all the records for the year. Contains an “addRecord” method to add an additional record to a year.
  • Driver: A tool to test your StopAndFrisk implementation interactively. You are free to use this driver to test all of your methods. Feel free to edit this class, as it is provided only to help you test. It is not submitted and/or graded.
  • StdIn and StdOut: libraries to handle input and output. Do not edit these classes.
  • CSV files: Files containing each year’s record information to be tested by the driver (2013.csv, 2014.csv, 2021.csv, 2022.csv). Feel free to manipulate them, as they will not be submitted to Autolab.

StopAndFrisk.java

The class contains 1 (one) instance variable:

  • ArrayList<SFYear> database

The database ArrayList keeps track of stop and frisk records by year.

  • Each SFYear corresponds to 1 (one) year of SFRecords.
  • Each SFRecord corresponds to 1 (one) stop and frisk occurrence.

Methods implemented by you:

For all of these, look at the comments above the methods for more information. 

1. readFile()

This method reads stop and frisk records from an input CSV file. Files in CSV format are plain text files that store data by delimiting data entries with commas.

For each record read from the file, this method creates a SFRecord object and inserts it into the corresponding year in the database.

Remember that you must call readFile() every time you test a new CSV file, before calling any other method.

This method must be completed before testing the following methods.

To complete this method:

  1. Read a record from the input file using the .split method (see below). Each line in the input file represents a stop and frisk record. 
    1. Remember that for every CSV file, the very first line is a description of the categories, so you must ensure you skip over it. 
  2. Instantiate a new SFRecord object to store the record information just read from the file.
  3. Check if the database ArrayList contains the year from the SFRecord object. 
    1. If yes, add the record to its designated year.
    2. If no,
      1. instantiate a new SFYear, and
      2. add the record to that year, and
      3. add the year to the database ArrayList.
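The year lookup in step 3 can be sketched as follows. This is a minimal illustration, not the expected solution: Rec and YearBucket are hypothetical stand-ins for the provided SFRecord and SFYear classes (whose constructors may differ), and insert is a made-up helper name.

```java
import java.util.ArrayList;

class YearLookupSketch {
    // Hypothetical stand-in for SFRecord, reduced to one field.
    static class Rec {
        final String description;
        Rec(String description) { this.description = description; }
    }

    // Hypothetical stand-in for SFYear; the real constructor may differ.
    static class YearBucket {
        private final int year;
        private final ArrayList<Rec> records = new ArrayList<>();
        YearBucket(int year) { this.year = year; }
        int getCurrentYear() { return year; }
        ArrayList<Rec> getRecordsForYear() { return records; }
        void addRecord(Rec r) { records.add(r); }
    }

    // Adds the record to its year, creating the year first if it is missing.
    static void insert(ArrayList<YearBucket> database, int year, Rec record) {
        for (YearBucket bucket : database) {
            if (bucket.getCurrentYear() == year) {
                bucket.addRecord(record); // year already present
                return;
            }
        }
        YearBucket created = new YearBucket(year); // year not found
        created.addRecord(record);
        database.add(created);
    }

    public static void main(String[] args) {
        ArrayList<YearBucket> database = new ArrayList<>();
        insert(database, 2013, new Rec("FELONY"));
        insert(database, 2013, new Rec("ROBBERY"));
        insert(database, 2014, new Rec("CPW"));
        System.out.println(database.size());                            // prints 2
        System.out.println(database.get(0).getRecordsForYear().size()); // prints 2
    }
}
```

Returning as soon as the year is found keeps the "not found" case as the only code path that reaches the end of the method.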

Please refer to the diagram below to get a better visual understanding of the database.

As mentioned previously, you have been provided 4 CSV files to test the methods on. The format for each of them is as follows:

  • A CSV (Comma Separated Values) file has its values separated by commas. 
  • Each file contains (comma separated) values, and you are responsible for retrieving the SPECIFIC index values provided to you. Those are: 
    • Description (String – ALL CAPS) 
    • Gender (String – “F” (female) or “M” (male)) 
    • Arrested (“Y” (yes) or “N” (no) boolean)
    • Frisked (“Y” (yes) or “N” (no) boolean) 
    • Race (String – “B” (Black), “W” (White), “A” (Asian)) 
    • Location (String – ALL CAPS)

Use the StdIn library to read from a file:

  • StdIn.setFile(filename) opens a file to be read

To read one record do:

String[] recordEntries = StdIn.readLine().split(",");

int year = Integer.parseInt(recordEntries[0]);

String description = recordEntries[2];

String gender = recordEntries[52];

String race = recordEntries[66];

String location = recordEntries[71];

boolean arrested = recordEntries[13].equals("Y");

boolean frisked = recordEntries[16].equals("Y");
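To try the parsing logic without the real CSV files or the StdIn library, the sketch below builds a made-up 72-column line in memory and parses it with the same column positions. The class name, the summarize helper, and the sample values are assumptions for illustration only.

```java
import java.util.Arrays;

class ParseLineSketch {
    // Column positions listed in the assignment text.
    static final int YEAR = 0, DESCRIPTION = 2, ARRESTED = 13, FRISKED = 16;
    static final int GENDER = 52, RACE = 66, LOCATION = 71;

    // Builds a made-up 72-column line so the sketch runs without the real files.
    static String sampleLine() {
        String[] cols = new String[72];
        Arrays.fill(cols, "x");
        cols[YEAR] = "2013";
        cols[DESCRIPTION] = "FELONY";
        cols[ARRESTED] = "N";
        cols[FRISKED] = "Y";
        cols[GENDER] = "M";
        cols[RACE] = "B";
        cols[LOCATION] = "BROOKLYN";
        return String.join(",", cols);
    }

    // Splits one line and converts the fields of interest to their Java types.
    static String summarize(String line) {
        String[] e = line.split(",");
        int year = Integer.parseInt(e[YEAR]);       // text to int
        boolean arrested = e[ARRESTED].equals("Y"); // "Y"/"N" to boolean
        boolean frisked = e[FRISKED].equals("Y");
        return year + " " + e[DESCRIPTION] + " " + e[GENDER] + " " + e[RACE]
                + " " + e[LOCATION] + " arrested=" + arrested + " frisked=" + frisked;
    }

    public static void main(String[] args) {
        String[] lines = { "made-up header line", sampleLine() };
        for (int i = 1; i < lines.length; i++) { // start at 1 to skip the header
            System.out.println(summarize(lines[i]));
        }
        // prints: 2013 FELONY M B BROOKLYN arrested=false frisked=true
    }
}
```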

2. populationStopped

This method takes in two parameters (year and race) and returns an ArrayList of records of people of that specific race stopped in that specific year.

Remember, every one of these records refers to a time that a person was stopped; you are just finding all records of stops from the parameter year where the person who was stopped is of the parameter race.

When testing the method in the Driver, you will notice that a number is outputted instead of the ArrayList. This number is the size of the ArrayList that you return in your method (the number of people). Seeing the number will allow you to put into context whether or not racial bias exists in Stop and Frisk cases.

To complete this method: 

  1. Create the ArrayList to be returned.
  2. Access the specific year you are looking for in the database array.
  3. Traverse the records in that year and keep track of the ones that match the specific race we are looking for. 
    1. If a record matches the specific race criteria, store this record into the ArrayList from step (1).
  4. Return the ArrayList from step (1).
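Steps 1, 3, and 4 amount to a filter over one year's records. The sketch below shows only that filtering step; Rec is a hypothetical stand-in for SFRecord, and matchingRace is a made-up helper that receives the records of an already selected year.

```java
import java.util.ArrayList;

class RaceFilterSketch {
    // Hypothetical stand-in for SFRecord, keeping only the race field.
    static class Rec {
        private final String race;
        Rec(String race) { this.race = race; }
        String getRace() { return race; }
    }

    // Returns the records whose race equals the requested one.
    static ArrayList<Rec> matchingRace(ArrayList<Rec> recordsForYear, String race) {
        ArrayList<Rec> result = new ArrayList<>();
        for (Rec r : recordsForYear) {
            if (r.getRace().equals(race)) { // compare Strings with equals, not ==
                result.add(r);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ArrayList<Rec> records = new ArrayList<>();
        records.add(new Rec("B"));
        records.add(new Rec("W"));
        records.add(new Rec("B"));
        records.add(new Rec("A"));
        System.out.println(matchingRace(records, "B").size()); // prints 2
    }
}
```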

Suggested getter methods to use:

  • getCurrentYear()
  • getRace()
  • getRecordsForYear()

This is the expected output for the Black population in 2013.csv:

3. friskedVSArrested

When a person is stopped they might be frisked, arrested, or not even frisked and then let go. It is also possible that the person is frisked and arrested.

This method outputs the percentage of records where the person was frisked AND the percentage of records where the person was arrested in the parameter year. 

To complete this method:

  1. Create two counters, one for the population frisked and the other for the population arrested.
  2. Access the specific year you are looking for in the database array.
  3. Traverse the records in that year updating the counters.
    1. Utilize getFrisked() and getArrested() to check if the person was frisked and/or arrested.
  4. Compute the percentages once all records for the year have been traversed.
    1. percentage = counter / number of records for the year * 100; (use double arithmetic, since integer division would truncate the result)
  5. Create a 1D output array that contains two values (in the following order):
    1. output[0]: the percentage of the population that was frisked
    2. output[1]: the percentage of the population that was arrested
  6. Return the output array.
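The counting and percentage steps can be sketched on plain boolean arrays. This is an illustration under assumptions: the friskedAndArrested helper and the parallel-array input are made up, whereas the real method reads the flags from SFRecord objects via getFrisked() and getArrested().

```java
class PercentageSketch {
    // frisked[i] and arrested[i] describe the same (made-up) record i.
    static double[] friskedAndArrested(boolean[] frisked, boolean[] arrested) {
        int friskedCount = 0;
        int arrestedCount = 0;
        for (int i = 0; i < frisked.length; i++) {
            if (frisked[i]) friskedCount++;
            if (arrested[i]) arrestedCount++;
        }
        double total = frisked.length; // double, so the divisions below are not truncated
        return new double[] { friskedCount / total * 100, arrestedCount / total * 100 };
    }

    public static void main(String[] args) {
        boolean[] frisked = { true, true, true, false };
        boolean[] arrested = { true, false, false, false };
        double[] output = friskedAndArrested(frisked, arrested);
        System.out.println(output[0]);   // prints 75.0
        System.out.println(output[1]);   // prints 25.0
        System.out.println(3 / 4 * 100); // prints 0: integer division truncates
    }
}
```

The last line of main shows why the total is stored as a double: with int operands, 3 / 4 evaluates to 0 before the multiplication happens.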

This is the expected output for 2013.csv:

4. genderBias

This method creates a 2D array containing the percentage of Black females, White females, Black males, and White males who were stopped for any reason. This gives an idea of whether there is gender bias in any given year of stop and frisks.

To complete this method: 

  1. Access the specific year you are looking for in the database array.
  2. Create counters to keep track of the number of Black people who were stopped, the number of White people who were stopped, as well as the number of Black men who were stopped, the number of White men who were stopped, the number of Black women who were stopped, and the number of White women who were stopped.
    1. Utilize the SFRecord getRace() method to identify a person’s race
      1. If the race equals “B” (Black), increase the Black population count.
      2. If the race equals “W” (White), increase the White population count.
    2. Utilize the SFRecord getGender() method to identify a person’s gender
      1. If the gender equals “F” (Female), increase the Female count for that race.
      2. If the gender equals “M” (Male), increase the Male count for that race.
  3. Create a 2D array of 2 rows and 3 columns containing the information in the table below. For example, to compute the Black Female Percentage use:
    1. percentage = (number of Black females / number of Black people) * 0.5 * 100
  4. Return the resulting 2D array.

Note 1: We are looking only at two races (Black and White) in this method. Total female percentage refers only to those two races; the same applies to the total male percentage.

Note 2: The total percentages should not be above 100%
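The formula and notes above can be sketched as follows. The array layout is an assumption, because the table is not reproduced here: row 0 holds the female percentages, row 1 the male percentages, and the columns are Black, White, and total (taken as the sum of the first two). The table helper and the parallel-array input are also made up for illustration.

```java
import java.util.Arrays;

class GenderTableSketch {
    // races[i] and genders[i] describe the same (made-up) record i.
    static double[][] table(String[] races, String[] genders) {
        int black = 0, white = 0;
        int blackF = 0, whiteF = 0, blackM = 0, whiteM = 0;
        for (int i = 0; i < races.length; i++) {
            boolean female = genders[i].equals("F");
            boolean male = genders[i].equals("M");
            if (races[i].equals("B")) {
                black++;
                if (female) blackF++;
                if (male) blackM++;
            } else if (races[i].equals("W")) {
                white++;
                if (female) whiteF++;
                if (male) whiteM++;
            } // any other race is ignored in this method
        }
        double[][] out = new double[2][3];
        // Assumed layout: rows = {female, male}, columns = {Black, White, total}.
        out[0][0] = (double) blackF / black * 0.5 * 100;
        out[0][1] = (double) whiteF / white * 0.5 * 100;
        out[0][2] = out[0][0] + out[0][1];
        out[1][0] = (double) blackM / black * 0.5 * 100;
        out[1][1] = (double) whiteM / white * 0.5 * 100;
        out[1][2] = out[1][0] + out[1][1];
        return out;
    }

    public static void main(String[] args) {
        String[] races   = { "B", "B", "B", "B", "W", "W", "A" };
        String[] genders = { "F", "M", "M", "M", "F", "M", "F" };
        double[][] t = table(races, genders);
        System.out.println(Arrays.toString(t[0])); // prints [12.5, 25.0, 37.5]
        System.out.println(Arrays.toString(t[1])); // prints [37.5, 25.0, 62.5]
    }
}
```

With the 0.5 factor, the two totals add up to 100 when every counted record is marked “F” or “M”, which is consistent with Note 2.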

This is the expected output for 2013.csv:

5. crimeIncrease

This method returns a double value representing the percentage of crime increase (if the number is positive) or crime decrease (if the number is negative) between any 2 years. The parameters for this method are: the description of the record, year 1, year 2.

Let us look at 2 examples. If the crime description is FELONY, year 1 is 2013 and year 2 is 2014, then when you read in both CSV files and input “FELONY” for the crimeIncrease method, you are expected to get an output of 1.76%, indicating that the percentage of felonies increased by 1.76 percent from 2013 to 2014.

In another example, if the crime description is ROBBERY, year 1 is 2013 and year 2 is 2014, then when you read in both CSV files and input “ROBBERY” for the crimeIncrease method, you are expected to get an output of -0.41%, indicating that the percentage of robberies decreased by 0.41 percent from 2013 to 2014.

***In a real-world context, we always want a decrease in crime as the years progress. See if that’s the case or not!

To complete this method: 

  1. Access the specific years you are looking for in the database array.
  2. Traverse the records of both years, looking for the specific crime description.
    1. Keep count of the records in which the specific crime description occurs.
    2. Instead of using the String class equals() method to compare the record’s description to the parameter, use the String class indexOf(description) method to check if the record’s description contains the parameter (indexOf returns -1 when it does not).
  3. Calculate the individual percentages for each year. 
  4. Return the double value which is either (positive or negative) and shows an increase or decrease in crime between the two years. 

Notes

  • Make sure that all the records for both years are in the database prior to running this method. To insert a year into the database run readFile() for each year.
  • Consider edge cases: year1 cannot be greater than or equal to year2
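A minimal sketch of the substring matching and the per-year percentages follows. It assumes that the returned value is the year 2 percentage minus the year 1 percentage, which is one reading of the steps above; the helper names and the sample descriptions are made up.

```java
class CrimeChangeSketch {
    // Percentage of descriptions that contain the crime text anywhere.
    static double percentOf(String[] descriptions, String crime) {
        int matches = 0;
        for (String d : descriptions) {
            if (d.indexOf(crime) >= 0) { // indexOf returns -1 when there is no match
                matches++;
            }
        }
        return (double) matches / descriptions.length * 100;
    }

    // Assumption: the change is the difference between the two percentages.
    static double change(String[] year1, String[] year2, String crime) {
        return percentOf(year2, crime) - percentOf(year1, crime);
    }

    public static void main(String[] args) {
        String[] year1 = { "FELONY", "ROBBERY", "GRAND LARCENY FELONY", "CPW" };
        String[] year2 = { "FELONY", "CPW", "CPW", "CPW" };
        System.out.println(percentOf(year1, "FELONY"));     // prints 50.0
        System.out.println(percentOf(year2, "FELONY"));     // prints 25.0
        System.out.println(change(year1, year2, "FELONY")); // prints -25.0
    }
}
```

Note that "GRAND LARCENY FELONY" counts as a match here, which equals() would have missed.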

This is the expected output for 2013.csv (CPW stands for Criminal Possession of Weapon):

6. mostCommonBorough

This method outputs the NYC borough where the most stops occurred in a given year. This method will analyze the following five boroughs in New York City: Brooklyn, Manhattan, Bronx, Queens, and Staten Island.

To complete this method:

  1. Access the specific year you are looking for in the database array.
  2. You will need to keep a counter for each location; there are only 5 (Brooklyn, Manhattan, Bronx, Queens, and Staten Island).
    1. Utilize getLocation() to find the borough’s name.
    2. Note: all the boroughs in the CSV are in all caps (e.g. “BROOKLYN”)
      1. Nonetheless, it is safer to use the String class equalsIgnoreCase() when comparing the location.
  3. The method returns the borough with the largest number of stops in the year.

There are several ways to do this. One way is to use 2 (two) parallel arrays: one for the counters and one for the borough names. 

The counts and the borough names can be stored in 2 (two) regular arrays or 2 (two) ArrayLists. The figure below shows two parallel arrays. This means that the value in counts[i] refers to the number of stops in borough[i]. For example, counts[2] contains 43, therefore 43 is the number of stops in borough[2] which is the Bronx.

  1. Traverse the counts array to find the index with the largest value, say i. In the example below the largest value is at index 0.
  2. Return the borough from the same index, borough[i]. In the example below that will be Brooklyn.
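The parallel-array approach can be sketched like this. The input is a made-up array of location strings rather than SFRecord objects, the mostCommon helper is hypothetical, and the capitalization of the returned name is an assumption.

```java
class BoroughCountSketch {
    static String mostCommon(String[] locations) {
        // Parallel arrays: counts[i] is the number of stops in boroughs[i].
        String[] boroughs = { "Brooklyn", "Manhattan", "Bronx", "Queens", "Staten Island" };
        int[] counts = new int[boroughs.length];

        for (String location : locations) {
            for (int i = 0; i < boroughs.length; i++) {
                if (boroughs[i].equalsIgnoreCase(location)) {
                    counts[i]++;
                    break; // a location matches at most one borough
                }
            }
        }

        // Find the index of the largest count; on a tie the earlier index wins.
        int maxIndex = 0;
        for (int i = 1; i < counts.length; i++) {
            if (counts[i] > counts[maxIndex]) {
                maxIndex = i;
            }
        }
        return boroughs[maxIndex];
    }

    public static void main(String[] args) {
        String[] locations = { "BROOKLYN", "BRONX", "BROOKLYN", "QUEENS", "bronx", "BROOKLYN" };
        System.out.println(mostCommon(locations)); // prints Brooklyn
    }
}
```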

This is the expected output for 2013.csv:

Before submission

Collaboration policy. Read our collaboration policy here.

Submitting the assignment. Submit StopAndFrisk.java via the web submission system called Autolab. To do this, click the Assignments link from the course website; click the Submit link for that assignment.

Getting help

If anything is unclear, don’t hesitate to drop by office hours or post a question on Piazza.

  • Find instructors’ office hours here
  • Find tutors’ office hours in Canvas -> Tutoring, RU CATS
  • Find head TAs’ office hours here
  • POST on Piazza in Canvas -> Piazza
  • In addition to office hours we have the CAVE (Collaborative Academic Versatile Environment), a community space staffed with lab assistants, who are undergraduate students further along in the CS major and can answer questions.

This assignment uses data from the New York City Police Department.

Assignment by Tanvi Yamarthy and Vidushi Jindal