2019-02-06 | J++ Newsletter #68

Pandas is hard. Don't make it harder!

This week's newsletter is dedicated to coding. Coding for research, coding for visualization, and coding for fun!

Make Pandas easier

We use Pandas (a Python module for data analysis) a lot, and we have largely replaced Excel with Pandas code. But Pandas is far from easy to learn, and one of the reasons is that it suffers heavily from “feature bloat”. In trying to cater to every possible need, it has accumulated a lot of features. That's why we agree with almost everything Ted Petrou writes in his Medium post from last week, Minimally Sufficient Pandas:
https://medium.com/dunder-data/minimally-sufficient-pandas-a8e67f2a2428


Visualize in R

At the BBC, they have been using R, rather than Python, for their analysis. In this excellent write-up, they explain how they recently moved to doing not only analysis but also visualization in the same tool (using the ggplot2 R package).
https://medium.com/bbc-visual-and-data-journalism/how-the-bbc-visual-and-data-journalism-team-works-with-graphics-in-r-ed0b35693535
Pay attention to how they worked with a cookbook of shared knowledge to collectively improve their chart-making in the newsroom:
https://bbc.github.io/rcookbook/



...or in Python

The Python equivalent of the ggplot2 R package would be Matplotlib. It, too, is highly customizable once you find out how to work it. We use it for our news service Newsworthy. See it in action in the recently published reports on crime trends in Swedish municipalities:
http://newsworthy.se/sv/report/crime/
We use our own wrapper around Matplotlib, to make it easier to create charts in a uniform fashion. That library is open source, available here:
https://github.com/jplusplus/newsworthycharts



Swedish language analysis

No matter what language you code in, if you ever do text analysis in Swedish, you'll want to bookmark Peter Dahlgren's collection of Swedish language data. Here you'll find things like lemma dictionaries (for finding, more or less, the right word stem), stop words (for filtering out things like prepositions), and all the other resources that are easy to find for English but scarce for smaller languages.
https://github.com/peterdalle/svensktext
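
To give an idea of how stop words and a lemma dictionary are typically used, here is a minimal Python sketch. Note that the word lists below are tiny, hand-written samples invented for illustration; they are not taken from the collection above, and the names STOP_WORDS, LEMMAS, tokenize and normalize are our own. In real use you would load full lists from files instead.

```python
import re

# Hypothetical sample data, made up for this example.
# A real stop word list and lemma dictionary would be much larger.
STOP_WORDS = {"och", "i", "på", "att", "en", "det", "som", "är"}
LEMMAS = {"bilarna": "bil", "bilar": "bil", "husen": "hus", "stora": "stor"}


def tokenize(text):
    """Lowercase the text and split it into words (handles å, ä and ö)."""
    return re.findall(r"\w+", text.lower())


def normalize(text):
    """Drop stop words, then map each remaining word to its lemma if known."""
    tokens = tokenize(text)
    return [LEMMAS.get(token, token) for token in tokens if token not in STOP_WORDS]


print(normalize("Bilarna och husen är stora"))
# → ['bil', 'hus', 'stor']
```

Words missing from the lemma dictionary are passed through unchanged, which is usually a reasonable fallback when the dictionary is incomplete.
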
Copyright © 2019 Journalism++ Stockholm, All rights reserved.

