Goal: Develop experience with providing end users with access to data
The Story
As a tourist and movie fan, where can I find information to help plan a tour of San Francisco?
Tourists in a city are often interested in places that are special to them. As movie fans, my groupmate and I wanted to know whether there is data on which places have appeared in movies, so that we could explore them. Knowing which places have had films shot there also helps in understanding the city and its culture.
Motivation
If you love movies, and you love San Francisco, you're bound to love this: a listing of filming locations of movies shot in San Francisco from 1905 to 2014. The approach we use here could also be applied to other cities, to help people explore tourist spots from the more interesting perspective of film.
Project Goals
1. Using movie-related information (ratings, runtime, and popularity), provide visualizations for analyzing the underlying patterns in the relationship between time, location, and the movies themselves, so that tourists can learn more about each movie's context and its association with an address over time;
2. Help people explore the data from the perspective of historical movie shoots;
3. Provide insights into the most popular filming places and their locations in the city.
Data Selection: Develop experience with selecting data that can effectively tell the story we wish to tell
Data Set
The primary dataset we have selected is available here at SFGov's database.
In the "Film Locations in San Francisco" dataset you'll find the titles, locations, fun facts, and the names of the director, writer, actors, and studio for most films shot in San Francisco.
What's Missing to Complete Our Goal
For the first goal of our project, the primary dataset lacks numeric data about the movies, so we used the OMDB API, an open source site that grabs movie information from IMDB (the world's largest movie database), to supplement it.
Example of the URL we used for accessing the API:
http://www.omdbapi.com/?r=json&t=Forrest%20Gump&y=1994, where we searched for the film "Forrest Gump" released in 1994; the response is returned in JSON format.
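As a minimal sketch of how such a request could be built and sent from Java, the class below assembles a URL in the same shape as the example and reads the response body over a plain HTTP connection. The class and method names (`OmdbLookup`, `buildUrl`, `fetch`) and the timeout values are illustrative assumptions, not our actual code. The URL carries only the parameters shown in the example; the service may require additional parameters such as an API key, which are not covered here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; names are for illustration only.
final class OmdbLookup {

    // Builds a request URL in the same shape as the example above.
    static String buildUrl(String title, int year) {
        // URLEncoder encodes spaces as '+'; swap them for %20 to match the example URL.
        String encodedTitle = URLEncoder.encode(title, StandardCharsets.UTF_8).replace("+", "%20");
        return "http://www.omdbapi.com/?r=json&t=" + encodedTitle + "&y=" + year;
    }

    // Sends a GET request and returns the raw response body (expected to be JSON).
    static String fetch(String requestUrl) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(requestUrl).toURL().openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(5000); // assumed timeout, in milliseconds
        conn.setReadTimeout(5000);
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            StringBuilder body = new StringBuilder();
            String line;
            while ((line = reader.readLine()) != null) {
                body.append(line);
            }
            return body.toString();
        } finally {
            conn.disconnect();
        }
    }
}
```

Calling `OmdbLookup.buildUrl("Forrest Gump", 1994)` reproduces the example URL, which can then be passed to `fetch`.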
To plot places on a map-based visual, we need latitude and longitude information, so we chose one of the world's largest public suppliers of geographical information, the Google Places API.
The location information in our dataset is poorly formatted and includes no geocoded addresses, whereas the text search service of the Google Places API accepts fuzzy search terms and returns the best-matching latitude and longitude.
Example of the URL we used for accessing the API:
https://maps.googleapis.com/maps/api/place/textsearch/json?query=golden+gate+bridge+san+francisco&sensor=true&key={YOUR API KEY WITH GOOGLE PLACE}, where we searched for the place "golden gate bridge" in San Francisco. The response contains the place's information in JSON format, of which we only need the latitude and longitude. Because the response may contain multiple results, returned as JSON objects in ranked order, we took the first object, the one Google's search ranked highest, as the place we wanted. To make sure it was the closest match, we performed further data cleansing, described in the "cleaning and transforming" part of our documentation.
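The "take the first result" step can be sketched in Java as follows. This is an illustration under stated assumptions rather than our actual code: the names `PlacesLookup`, `buildUrl`, and `firstLatLng` are hypothetical, and the extraction relies on each result carrying its coordinates as `"location": {"lat": ..., "lng": ...}`. To stay dependency-free it uses a regular expression and assumes plain decimal numbers; a proper JSON parser would be more robust.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; names are for illustration only.
final class PlacesLookup {

    // Matches the first "location": {"lat": <number>, "lng": <number>} pair in the response.
    private static final Pattern LOCATION = Pattern.compile(
            "\"location\"\\s*:\\s*\\{\\s*\"lat\"\\s*:\\s*(-?\\d+(?:\\.\\d+)?)\\s*,"
            + "\\s*\"lng\"\\s*:\\s*(-?\\d+(?:\\.\\d+)?)");

    // Builds a text search URL in the same shape as the example above.
    static String buildUrl(String query, String apiKey) {
        // URLEncoder turns spaces into '+', which matches the example query string.
        String encodedQuery = URLEncoder.encode(query, StandardCharsets.UTF_8);
        return "https://maps.googleapis.com/maps/api/place/textsearch/json?query="
                + encodedQuery + "&sensor=true&key=" + apiKey;
    }

    // Returns {latitude, longitude} of the first result, or empty if none is found.
    static Optional<double[]> firstLatLng(String json) {
        Matcher m = LOCATION.matcher(json);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new double[] {
                Double.parseDouble(m.group(1)),
                Double.parseDouble(m.group(2))
        });
    }
}
```

Because results arrive in ranked order, the first regular-expression match corresponds to the top-ranked place; an empty `Optional` signals a location that needs manual cleanup.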
Dimensions for the Motion Chart
From the primary dataset:
Primary key: Title, Release Year, Locations
- Title
- Release Year
- Locations
- Director
From the external resource (OMDB API):
Primary key: Title, Year
Used Java to retrieve information from the API over an HTTP connection, specifying the movie name and release year as the key in the request URL.
- IMDB Rating
- Genre
- IMDB Votes (popularity information)
- Runtime (how long the movie lasts)
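To illustrate how the two sources line up on the shared key (title plus year), here is a rough Java sketch of the join. The class name `MotionChartJoin`, the array-based row layout, and the case-insensitive key normalization are assumptions made for the example, not a description of our actual pipeline.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper; names and row layout are for illustration only.
final class MotionChartJoin {

    // Composite key shared by both sources: normalized title plus release year.
    static String key(String title, int year) {
        return title.trim().toLowerCase(Locale.ROOT) + "|" + year;
    }

    // filmRows:  {title, releaseYear, location, director} from the primary dataset.
    // omdbByKey: key(title, year) -> {imdbRating, genre, imdbVotes, runtime} from OMDB.
    // Returns each film row extended with its OMDB fields; unmatched rows are skipped.
    static List<String[]> join(List<String[]> filmRows, Map<String, String[]> omdbByKey) {
        List<String[]> joined = new ArrayList<>();
        for (String[] film : filmRows) {
            String[] extra = omdbByKey.get(key(film[0], Integer.parseInt(film[1].trim())));
            if (extra == null) {
                continue; // no OMDB record for this title and year
            }
            String[] row = new String[film.length + extra.length];
            System.arraycopy(film, 0, row, 0, film.length);
            System.arraycopy(extra, 0, row, film.length, extra.length);
            joined.add(row);
        }
        return joined;
    }
}
```

Each joined row then carries all eight motion chart dimensions: the four from the primary dataset followed by the four from OMDB.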
Dimensions for the Places Insight Charts
From the primary dataset:
Primary key: Title, Release Year, Locations
We aggregated the data by location and counted how many different movies were shot at each location.
- Title
- Release Year
- Locations
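The aggregation by location can be sketched in Java as below. The class name `LocationCounts` and the array-based row layout are assumptions for the example. A movie is identified by title plus release year, so repeated rows for the same movie at the same location count only once.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; names and row layout are for illustration only.
final class LocationCounts {

    // rows: {title, releaseYear, location} from the primary dataset.
    // Returns location -> number of distinct movies shot there, in first-seen order.
    static Map<String, Integer> countDistinctMovies(List<String[]> rows) {
        Map<String, Set<String>> moviesByLocation = new LinkedHashMap<>();
        for (String[] row : rows) {
            String movie = row[0] + "|" + row[1]; // title plus year identifies a movie
            moviesByLocation.computeIfAbsent(row[2], k -> new HashSet<>()).add(movie);
        }
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Map.Entry<String, Set<String>> entry : moviesByLocation.entrySet()) {
            counts.put(entry.getKey(), entry.getValue().size());
        }
        return counts;
    }
}
```

Sorting the resulting map by value gives a ranking of the most filmed locations.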
From the external resource (Google Places API):
Primary key: Locations
Used Java to retrieve information from the API over an HTTP connection, specifying the location description taken from the primary dataset.
- Locations
- Latitude
- Longitude