
Dictionaries and strings

Hans Petter Langtangen [1, 2]

[1] Center for Biomedical Computing, Simula Research Laboratory
[2] Department of Informatics, University of Oslo

Dec 15, 2014

Table of contents

Dictionaries
      Making dictionaries
      Dictionary operations
            Remark
      Example: Polynomials as dictionaries
      Dictionaries with default values and ordering
            Dictionaries with default values
            Ordered dictionaries
      Example: File data in dictionaries
            Problem
            Solution
      Example: File data in nested dictionaries
            Problem
            Algorithm
            Implementation
            Dissection
      Example: Reading and plotting data recorded at specific dates
            Problem
            Solution
Strings
      Common operations on strings
            Substring specification
            Searching for substrings
            Substitution
            String splitting
            Upper and lower case
            Strings are constant
            Strings with digits only
            Whitespace
            Joining strings
      Example: Reading pairs of numbers
            Problem
            Solution
      Example: Reading coordinates
            Problem
            Solution 1: substring extraction
            Solution 2: string search
            Solution 3: string split
Reading data from web pages
      About web pages
      How to access web pages in programs
            Alternative 1
            Alternative 2
      Example: Reading pure text files
      Example: Extracting data from HTML
      Handling non-English text
Reading and writing spreadsheet files
      CSV files
      Reading CSV files
      Processing spreadsheet data
      Writing CSV files
            Remark
      Representing number cells with Numerical Python arrays
      Using more high-level Numerical Python functionality
Summary
      Chapter topics
            Dictionaries
            Strings
            Downloading Internet files
            Terminology
      Example: A file database
            Problem
            Solution
Exercises
      Exercise 1: Make a dictionary from a table
      Exercise 2: Explore syntax differences: lists vs. dicts
      Exercise 3: Use string operations to improve a program
      Exercise 4: Interpret output from a program
      Exercise 5: Make a nested dictionary from a file
      Exercise 6: Make a nested dictionary from a file
      Exercise 7: Compare data structures for polynomials
      Exercise 8: Compute the derivative of a polynomial
      Exercise 9: Specify functions on the command line
      Exercise 10: Interpret function specifications
      Exercise 11: Compare average temperatures in cities
References

The present chapter addresses many techniques for interpreting information in files and storing the data in convenient Python objects for further data analysis. A particularly handy object for many purposes is the dictionary, which maps objects to objects, very often strings to various kinds of data that later can be looked up through the strings. The section Dictionaries is devoted to dictionaries.
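
As a first taste, here is a minimal sketch of a dictionary that maps strings to numbers. The city names and temperature values are made-up illustration data, and the variable names `temps` and `oslo_temp` are chosen only for this example.

```python
# Illustrative data: city names (strings) mapped to temperatures (floats)
temps = {'Oslo': 13.0, 'London': 15.4, 'Paris': 17.5}

temps['Madrid'] = 26.0       # add a new key-value pair
oslo_temp = temps['Oslo']    # look up a value through its string key

# Visit the keys in alphabetical order and print each key with its value
for city in sorted(temps):
    print('%-8s %5.1f' % (city, temps[city]))
```

The string key plays the role that an integer index plays in a list, which is why a dictionary is convenient when data are naturally identified by names.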

Information in files often appears as pure text, so interpreting and extracting data from files sometimes requires sophisticated operations on the text. Python strings have many methods for performing such operations, and the most important functionality is described in the section Strings.
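
The following sketch hints at the kind of text interpretation meant here. It assumes a hypothetical line of text of the form `name=value, name=value`. The line content and the names `line` and `pairs` are invented for illustration and do not come from the chapter's example files.

```python
# Hypothetical line of text, as it might appear in a data file
line = 'x=1.2, y=-0.5'

pairs = {}
for part in line.split(','):                 # split into 'x=1.2' and ' y=-0.5'
    name, value = part.strip().split('=')    # strip whitespace, split at '='
    pairs[name] = float(value)               # convert the text to a number

print(pairs)
```

Only the standard string methods `split` and `strip` are used, together with a `float` conversion; the result is stored in a dictionary keyed by the variable names found in the text.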

The World Wide Web is full of information and scientific data that may be useful to access from a program. The section Reading data from web pages tells you how to read web pages from a program and interpret the contents using string operations.

Working with data often involves spreadsheets. Python programs may need to extract data from spreadsheet files, and it can also be advantageous and convenient to do the data processing itself in a Python program rather than in a spreadsheet program like Microsoft Excel or LibreOffice. The section Reading and writing spreadsheet files goes through relevant techniques for reading and writing files in the common CSV format for spreadsheets.
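
A rough sketch of such processing is given below, using Python's standard `csv` module. The spreadsheet content is a made-up three-line example held in a string rather than a real file, and the extra `total` column is an assumption introduced only to show one possible processing step.

```python
import csv
import io

# Hypothetical spreadsheet content in CSV format (normally read from a file)
text = 'item,jan,feb\nbooks,10.5,12.0\npens,3.0,4.5\n'

# Read all rows; each row becomes a list of strings
rows = list(csv.reader(io.StringIO(text)))
header, data = rows[0], rows[1:]

# Convert the number cells to float and append the row sum as a new column
header.append('total')
for row in data:
    values = [float(cell) for cell in row[1:]]
    row.append(str(sum(values)))

# Write the extended table back to CSV text
out = io.StringIO()
csv.writer(out, lineterminator='\n').writerows([header] + data)
result = out.getvalue()
print(result)
```

With a real file, `io.StringIO(text)` would be replaced by a file object opened for reading, and `out` by a file object opened for writing.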

The present chapter builds on fundamental programming concepts such as loops, lists, arrays, if tests, command-line arguments, and curve plotting. The folder src/files contains all the relevant program example files and associated data files.