Home | Lecture 6 | Problem 6.1 | Problem 6.2 | Problem 6.3 | Problem 6.4
Ah, well,” said Tonks, slamming the trunk’s lid shut, “at least > it’s all in. That could do with a bit of cleaning, too.” She pointed her wand at Hedwig’s cage. “Scourgify.” A few feathers and droppings vanished.
— Harry Potter and the Order of the Phoenix
Data, too, often needs to be “cleaned,” as by reformatting it, so that values are in a consistent, if not more convenient, format. Consider, for instance, this CSV file of students, before.csv, below:
name,house
"Abbott, Hannah",Hufflepuff
"Bell, Katie",Gryffindor
"Bones, Susan",Hufflepuff
"Boot, Terry",Ravenclaw
"Brown, Lavender",Gryffindor
"Bulstrode, Millicent",Slytherin
"Chang, Cho",Ravenclaw
"Clearwater, Penelope",Ravenclaw
"Crabbe, Vincent",Slytherin
"Creevey, Colin",Gryffindor
"Creevey, Dennis",Gryffindor
"Diggory, Cedric",Hufflepuff
"Edgecombe, Marietta",Ravenclaw
"Finch-Fletchley, Justin",Hufflepuff
"Finnigan, Seamus",Gryffindor
"Goldstein, Anthony",Ravenclaw
"Goyle, Gregory",Slytherin
"Granger, Hermione",Gryffindor
"Johnson, Angelina",Gryffindor
"Jordan, Lee",Gryffindor
"Longbottom, Neville",Gryffindor
"Lovegood, Luna",Ravenclaw
"Lupin, Remus",Gryffindor
"Malfoy, Draco",Slytherin
"Malfoy, Scorpius",Slytherin
"Macmillan, Ernie",Hufflepuff
"McGonagall, Minerva",Gryffindor
"Midgen, Eloise",Gryffindor
"McLaggen, Cormac",Gryffindor
"Montague, Graham",Slytherin
"Nott, Theodore",Slytherin
"Parkinson, Pansy",Slytherin
"Patil, Padma",Gryffindor
"Patil, Parvati",Gryffindor
"Potter, Harry",Gryffindor
"Riddle, Tom",Slytherin
"Robins, Demelza",Gryffindor
"Scamander, Newt",Hufflepuff
"Slughorn, Horace",Slytherin
"Smith, Zacharias",Hufflepuff
"Snape, Severus",Slytherin
"Spinnet, Alicia",Gryffindor
"Sprout, Pomona",Hufflepuff
"Thomas, Dean",Gryffindor
"Vane, Romilda",Gryffindor
"Warren, Myrtle",Ravenclaw
"Weasley, Fred",Gryffindor
"Weasley, George",Gryffindor
"Weasley, Ginny",Gryffindor
"Weasley, Percy",Gryffindor
"Weasley, Ron",Gryffindor
"Wood, Oliver",Gryffindor
"Zabini, Blaise",SlytherinSource: en.wikipedia.org/wiki/List_of_Harry_Potter_characters
Even though each “row” in the file has three values (last name, first name, and house), the first two are combined into one “column” (name), escaped with double quotes, with last name and first name separated by a comma and space. Not ideal if Hogwarts wants to send a form letter to each student, as via mail merge, since it’d be strange to start a letter with:
Dear Potter, Harry,
Rather than with, for instance:
Dear Harry,
In a file called scourgify.py, implement a program that:
- Expects the user to provide two command-line arguments:
- the name of an existing CSV file to read as input, whose columns are assumed to be, in order,
nameandhouse, and - the name of a new CSV to write as output, whose columns should be, in order,
first,last, andhouse.
- the name of an existing CSV file to read as input, whose columns are assumed to be, in order,
- Converts that input to that output, splitting each name into a first name and last name. Assume that each student will have both a first name and last name.
If the user does not provide exactly two command-line arguments, or if the first cannot be read, the program should exit via
sys.exitwith an error message.
- Note that csv module comes with quite a few methods, per docs.python.org/3/library/csv.html, among which are
DictReader, per docs.python.org/3/library/csv.html#csv.DictReader andDictWriter, per docs.python.org/3/library/csv.html#csv.DictWriter. - Note that you can tell a
DictWriterto write itsfieldnamesto a file usingwriteheaderwith no arguments, per docs.python.org/3/library/csv.html#csv.DictWriter.writeheader.
From the root of your repository execute cd 6-FileIO So your current working directory is ...
/6-FileIO $:Next execute
mkdir scourgifyto make a folder called scourgify in your codespace.
Then execute
cd scourgifyto change directories into that folder. You should now see your terminal prompt as /6-FileIO/scourgify $. You can now execute
code scourgify.pyto make a file called scourgify.py where you’ll write your program.
Here’s how to test your code manually:
-
Run your program with
python scourgify.py. Your program should exit usingsys.exitand provide an error message:Too few command-line arguments
-
Create empty files
1.csv,2.csv, and3.csv. Run your program withpython scourgify.py 1.csv 2.csv 3.csv. Your program should output:Too many command-line arguments
-
Run your program with
python scourgify.py invalid_file.csv output.csv. Assuminginvalid_file.csvdoesn’t exist, your program should exit usingsys.exitand provide an error message:Could not read invalid_file.csv
-
Run your program with
python scourgify.py before.csv after.csv. Assumingbefore.csvexists, your program should create a new file,after.csv, whose columns should be, in order, first, last, and house.
At the /6-FileIO/scourgify $ prompt in your terminal:
git add -A Add all changed files in the repository to be committed
git commit -m “Upload completed scourgify.py“Commit all changes in the REPO with the comment “Upload completed scourgify.py“ note: If the file is not complete, adjust the comment to describes what is being commited
git push Push all changes to the REPO