Assignment 4

Redesign of WalkScore

February 16, 2010

in Student Posts

Puneet & Danielle

Problem -> abstraction -> encoding -> implementation

Munzner describes the above process as ‘domain problem characterization, data/operation abstraction design, encoding/interaction technique design, algorithm design’

Deconstructing Walkscore
————————
Problem: Assess livability of a neighborhood by how far one has to walk for different services. The idea is that places you would usually walk to — parks, neighborhood grocery stores, restaurants and coffee shops — make up the social fabric of a neighborhood.

Abstraction: Livability is abstracted to availability of different services within walking distance.

Encoding: Map position to position, distance to color

Implementation: Show the results on a map.

Suggested improvement: Walkscore is a wonderful way to visualize and assess livability of a neighborhood. The improvements I can think of are: a heat map that creates a color gradient from green to red, where greener is closer and redder is farther from the origin. Ironically, Walkscore does implement a heat map, but only as a pre-created map for certain neighborhoods. As far as I can see, it does not have a facility for the users to create one for their own neighborhood.

An additional interface element would let users weight the various walking destinations: for some, having a grocery store within walking distance may matter most, while for others, neighborhood coffee shops or parks where neighbors gather may matter more. Being able to move sliders for the destination categories and watch the heat map change in real time would be one meaningful improvement that I can think of.
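To make the slider idea concrete, here is a minimal sketch of how one heat-map cell might be scored. It is not Walkscore's actual algorithm: the linear distance falloff, the 1600 m cutoff, the category names, and the distances are all assumptions made for illustration.

```python
def distance_score(distance_m, max_walk_m=1600.0):
    """Map a walking distance to [0, 1]: 1 at the doorstep, 0 at or beyond max_walk_m.

    The linear falloff and the 1600 m cutoff are assumptions, not Walkscore's rule.
    """
    return max(0.0, 1.0 - distance_m / max_walk_m)


def weighted_score(nearest_m, weights):
    """Weighted average of per-category distance scores, using slider weights."""
    total = sum(weights.values())
    if total == 0:
        return 0.0
    return sum(w * distance_score(nearest_m[c]) for c, w in weights.items()) / total


# Hypothetical distances (metres) from one map cell to the nearest destination of each kind.
nearest_m = {"grocery": 400.0, "coffee": 1200.0, "park": 800.0}

# Two hypothetical slider settings for two different users.
grocery_first = {"grocery": 3.0, "coffee": 1.0, "park": 1.0}
coffee_first = {"grocery": 1.0, "coffee": 3.0, "park": 1.0}

print(weighted_score(nearest_m, grocery_first))
print(weighted_score(nearest_m, coffee_first))
```

The same cell scores higher for the user who cares most about groceries, because the grocery store is the closest destination. Recomputing this value for every cell whenever a slider moves would produce the changing heat map.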

With this change, the user does not necessarily get a better or worse view of the information, but simply is presented with a different interpretation. If someone were looking to explore a particular area, the original encodings would be ideal to use. However, if someone was looking to make a decision about a location based on factors important to them, the heat map would provide a very straightforward basis for such a decision.

Puneet & Danielle

Deconstructing Multivariate Volumetric Vis
——————————————
Problem: Display the relationship between two different sets of multi-variate data in a single view.

Abstraction: The relationship between the two data sets is abstracted into a function of the predefined logical operators over the data set. The likely reasoning for this choice is to simplify the expression of the relationship of the data across a large number of variables. The exact values of the data are presented over a relative scale with respect to one another, again most likely in order to simplify visual representation. These abstractions are very logical decisions as far as simplifying the display and interpretation of a complicated body of data given that the purpose of this tool is more to demonstrate that such data can be shown, not necessarily in the most detail-oriented manner possible.

Encoding: The value of the variables is mapped to color. This is a fairly logical choice assuming an appropriate mapping of value to color. Additionally, the relationships of the variables between the data sets, as expressed by the logical function, are mapped to position. This decision appears to be motivated by the spatial nature of the logical expressions used to compare the data sets, implying a spatial relationship between data encoded in a spatial mapping.

Implementation: Generate a visual volume created by the above encodings in a digital format.

Suggested improvement:

While the visualizations generated by this system result in very aesthetically pleasing images, the outputs in their current state mean nothing. The encodings themselves have been abstracted to solely a relational level, but the nature of this relationship is left completely unspecified. One possible improvement for the representation is to map the values over an absolute and explicit color scale, with distinct visual intervals such that different data intervals are perceptually distinct from one another.

Additionally, the positional encoding of the data is completely uninterpretable at present. Essentially, the points mapped to the volume are mapped without regard for the user as there is no way to abstract any significant detail about the positional relationships from the visualization. One possible choice to improve the positional encoding of the data within each data set would be to give an explicit dimensionality to the different variables encoded in position in order to complement the spatial relationships encoded in the defining relational function.

These changes would likely detract from the aesthetics of the original image. However, they would make the visualization interpretable. By adding more explicit data to clarify the encodings, the visualizations become more about the information than art. Ideally, unlike the included redesign, transparency in the colors would be retained in order to maintain the volumetric aspect of the visualizations. However, by integrating more information about the encodings into the actual visual space, the reader can better understand what information is being presented. This new, more informative encoding could also then be implemented over an interactive platform to allow the user to navigate through the visualized data set. This sort of experimentation, however, was not done for this assignment.

HW4b Redesign the Bad

February 16, 2010

We chose to redesign a “future trend map”:
http://nowandnext.com/PDF/trends_and_technology_timeline_2010.pdf

This visualization encodes five dimensions of data in the following ways:

– “time zones”– radial distance from the center of the map, hue
– phenomena – text labels, position on category lines, connection?
– category of phenomena – hue
– type of phenomena – shape, glyphs
– global risks – bulleted list, containment?

The biggest problems we had with this visualization were:
– There is no easy way to find things – the subway lines weave and snake all over the place;
– The color encoding uses too many hues;

– There is no clear rule about how to interpret spatial locality, in terms of whether nearby trends relate to each other.

How we address these things:
– We straightened out the lines so that you can follow each trend more linearly.

– We grouped some of the trends into larger trends (which don’t really seem to be all that distinct from each other anyway) so that there are fewer divisions in the data.

– The implicit groupings based on spatial locality should not be preserved if they are not explicitly linked.

Our redesign is shown here. Not all of the trends (i.e., subway stations) are shown here because there were too many – but you can imagine what it would look like if they were all there. Major intersections of groupings of trends (subway lines) are shown on the left as connecting lines. The 7 aggregated trends are shown by color at the left, and the various “time zones” are shown at the bottom. The intent here is to make the future look a lot more boring — and predictable.

first redesign


Second redesign

For the second redesign, we wanted to compare the effect of grouping the trend-lines by leaving each one as a separate row (i.e., this time we didn’t group them). We reduced clutter by removing the connecting lines that showed “megatrends” and replacing them with entries in dedicated columns. Time was assigned to color to emphasize the qualitative difference in predictions that were further out. Again, the aim is minimalism and ease of use, rather than impressing the viewer with how messy and complicated the future is.

Second redesign

One possible drawback is that the original graphic supports a more un-ordered, wandering traversal, following from one trend to another – however, the readers of this graph can let their eyes wander across it as well. The only difference is that we have not preserved the intersections of the trend-lines, because they were not explicitly said to be meaningful.

Leslie Watkins & Chris Hinrichs

Team member:  Ye Liu only

Source: http://www.willisms.com/archives/2005/03/the_american_em.html

Source: http://graphics.cs.wisc.edu/Courses/Visualization10/archives/661-assignment-4a-critique-by-ye-liu

Analysis:

Problem definition: Show the audience the military expenditures of the 16 nations with the highest military spending. In particular, point out that America has the largest military expenditure, which is almost equal to the combined total of the other 15 top nations.

Data:

A: The names of the 16 nations who have the highest military expenditures.

B: The military expenditure amounts for each of the nations.

Abstraction:

Show a comparison of the military expenditures of the 16 nations. Emphasize America, as it has the largest military expenditure.

Mapping and encoding:

  1. Using a pie diagram to compare the military expenditures for different nations.
  2. Using different colors to mark for different countries, using different sizes (or central angles) of the pie partitions to show the amount of the expenditures.
  3. Drawing a meshed national flag in the U.S. partition to emphasize it.

Drawbacks:

  1. Mapping colors to 16 countries challenges the audience’s perceptual limits. It is very hard to distinguish every color and map it to a country.
  2. Much of the information is omitted, including the absolute value of the expenditures and the percentage.
  3. The emphasis on the U.S.A. is not successful: the meshed national flag is blurred, which makes it hard to recognize and causes misunderstanding.
  4. The edges of the pie partitions are not refined, and the figure seems very coarse.

New design:

A 3-D pie diagram with fine edges and every nation directly annotated on its pie partition would be one way to improve the design. It does not require the audience to map colors to countries, as it uses both color and position to encode the data, which reduces clutter. It can also provide data such as the percentage or the absolute value of the expenditures. It uses a larger font along with the percentage to emphasize the overwhelming expenditure of the U.S.A. It also looks much smoother, which makes it more attractive and pleasant for audiences to view.

Another method would be to use a world map again. The figure underneath shows an incomplete work but can already prove my point. Mapping countries to their own positions on the map with colors would be efficient for audiences with a little knowledge of geography, not to mention that there is plenty of room to annotate the names of the nations and list their expenditures. A relatively loose data arrangement would reduce the cluttering problem and make the map more pleasant to view.

Analysis II: Bank Graph

February 16, 2010

Market Cap of Banks:

The original graph was this:

The data for this visualization are 15 banks, and their market value at two distinct points of time.

It encodes bank to position, market value to size, and time to color and one might argue position.

Market Value: Diameter of a circle
Bank: Each bank has a separate position on the visualization
Time: As time is binary in this case, it is blue for one date and green for the other. The green circle for a specific bank sits within that bank’s blue circle, sharing the same bottom.

The first mapping simply fixes the problem that encoding value as diameter is misleading when comparing circle sizes. Instead, the market value is mapped to area. In this example, there is a mockup involving seven of the banks where the only difference from the original is the sizes of the circles, reflecting this change.
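The difference between the two mappings can be sketched in a few lines. The market values and the reference radius below are made up for illustration; only the square-root relationship is the point.

```python
import math


def radius_for_area(value, ref_value, ref_radius):
    """Radius that makes circle AREA proportional to value."""
    return ref_radius * math.sqrt(value / ref_value)


def radius_for_diameter(value, ref_value, ref_radius):
    """Radius that makes circle DIAMETER proportional to value (the misleading mapping)."""
    return ref_radius * (value / ref_value)


# Hypothetical bank: value falls from 100 to 25 (a 4x drop); the old circle has radius 10.
old_value, new_value, old_radius = 100.0, 25.0, 10.0

r_area = radius_for_area(new_value, old_value, old_radius)
r_diam = radius_for_diameter(new_value, old_value, old_radius)

# Ratio of new circle area to old circle area under each mapping.
area_ratio_area_mapping = (r_area / old_radius) ** 2
area_ratio_diam_mapping = (r_diam / old_radius) ** 2

print(r_area, area_ratio_area_mapping)
print(r_diam, area_ratio_diam_mapping)
```

With the diameter mapping, a fourfold drop in value shrinks the visible area sixteenfold, so the decline looks far worse than it is. With the area mapping, the area shrinks by the same factor of four as the value.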

The second visualization creates a bar graph with the following encodings:

Market Value: Height of the colored portions of the bar
Bank: Each bank has a separate position on the bar graph
Time: Time is still binary, so color is still used, in this case blue and grey

The bars are also ordered from least to greatest initial market value, from left to right.

The first altered visualization isn’t much of an improvement over the original. While it gives a better first-glance look at the difference between market values at the different times within an individual bank, it is still not easy to compare between banks, and circle size still isn’t the best way to display quantitative data.

The bar graph is much better than either visualization. It’s easy to compare market values within one bank and between banks. The banks are also ordered in some fashion, whereas in the original visualization, there did not seem to be any reason for the order chosen. We’ve also chosen to emphasize new market value, as we believe it was the more important of the two data sets.

Crayon Chart:

Problem: Show the additions and changes to the manufactured crayon colors over the last 107 years
Abstraction: Color and time
Encoding: Color is mapped to both color and position, and time is mapped to position.
Implementation: Vector (likely) based static image

The original graph uses color to represent the actual color of the crayons. This literal use of variables is similar to mapping position to spatial data, where the required level of interpretation by the reader is reduced through the lack of non-intuitive associations. Improvements were made by including additional data at the top of the graph denoting the total number of colors, along with the number of colors added and removed, for each time stamp. The additions increase the visual noise, but allow the reader to extract data easily.

We redesigned this graphic from National Geographic on the costs and benefits of healthcare:

The Cost of Care

The data consists of a series of points, one corresponding to each country surveyed. Each point has 4 values associated with it: dollar amount spent per capita (quantitative), expected life span (quantitative), average number of doctor visits per year (ordered; the original graphic condensed this number into 4 bins), and whether or not the country has a public health insurance system (categorical; all countries have a public health insurance system except for the US and Mexico).

The visual encoding used in the original graphic was:

–       Cost of healthcare per person- y position

–       Average life expectancy- y position

–       Average number of doctor’s visits per person- line thickness

–       Type of coverage (universal or otherwise)- hue

A good point of this design is that vertical position was used to encode the 2 most important variables, cost per capita and expected life-span. As an added bonus, the (scaled) difference between these 2 quantities falls out as the slope of the line representing each country. The average number of doctor visits per capita per year is encoded as the thickness of the line. Thickness is not listed in Munzner’s table of visual channels (p. 683), though it can be interpreted as a kind of length. The author’s intent was to de-emphasize the number of visits, which may be why it was binned and not given a more prominent channel.


Redesign 1:

For the first redesign, we decided that it might be an improvement to remove all of the line crossings, as this may reduce the visual clutter. Thus, we made each line vertical, so that now its top y-coordinate encodes cost, and its bottom (negative) coordinate encodes expected life-span. We left doctor visits out, as there was no simple way to encode this as line thickness in MATLAB (that we have found). Color (hue) was used to show the final variable, existence of a national health care system. Ideally, the countries should be labeled, and if it were feasible to do so, we would have. A remaining issue is how to populate the new free variable, the x-coordinate of each line. In order to better highlight the underlying trend from the original graphic, we used the (scaled) difference between cost and lifespan as the x-coordinate; position is a stronger channel than slope, which carried this difference in the original graphic. An advantage is that it can be seen that of all the countries listed, Mexico actually has the lowest difference, which was not easy to detect from the original graphic.
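The post does not say how the difference was scaled. One plausible reading, sketched here in Python rather than MATLAB, is to min-max normalize each variable to [0, 1] and subtract. The scaling choice, the function names, and the sample numbers are all assumptions for illustration, not the values used in the redesign.

```python
def min_max(values):
    """Rescale a list of numbers to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def scaled_difference(costs, lifespans):
    """Per-country difference between normalized cost and normalized lifespan."""
    return [c - l for c, l in zip(min_max(costs), min_max(lifespans))]


# Hypothetical countries: cost per capita (dollars) and life expectancy (years).
costs = [1000.0, 3000.0, 7000.0]
lifespans = [75.0, 81.0, 78.0]

x_positions = scaled_difference(costs, lifespans)
print(x_positions)
```

A large positive value marks a country that spends a lot relative to the lifespan it gets, and a negative value marks the opposite. Because both variables are on a common scale, the difference is comparable across countries.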


Redesign 2:

For the second redesign of this graphic, we chose to encode average cost as the vertical position of each line, and expected lifespan as length. The idea here is to test whether anything is lost by using a slightly weaker channel for one of the main variables, and also to see if there is a discernible pattern in the ratio between cost and lifespan, encoded as the slope between the tips of each line and the origin. This time, we encoded number of doctor visits as the radius of a circle centered at the middle of each line, with the color of the circle representing the categorical variable, existence of a national health plan.

One drawback to this approach was that the circles tended to overlap a bit, making it difficult to connect each line with its corresponding circle. Also, the ratio of cost per expected year of lifetime does not pop out quite as well as expected. An advantage is that the 2 variables are not fused into a single line, and are easier to read separately. Also, the outliers are still apparent in this view.


As a final comparison, the author himself considered a scatter plot as an alternative, but rejected this in favor of the original plot above:

http://blogs.ngm.com/blog_central/2010/01/the-other-health-care-debate-lines-vs-scatterplot.html

-Leslie Watkins & Chris Hinrichs

Visualization Assignment 4-b

By Shuang Huang & Faisal Khan

This posting is about the re-design of the Time magazine visualization here (http://www.time.com/time/2007/america_numbers/commuting.html).  This was about showing average commute time across major U.S. cities.

Original encodings

Here are the encodings used in the original map:

-Cities

Position on 3D US map

-Average Commute time:

length:  Bars with heights proportional to the commute time were erected at the geographical locations of the major cities.

Color: Color saturation was used to encode the same information a second time. Each bar was colored with a saturation value proportional to the average commute time.

New encodings

A simple solution is to use a 2D map instead of the original 3D map to represent the geographical position of each city. We can use the original color saturation values to color a rectangular region around the location of each city, thus showing the average commute values. There does not appear to be much deviation in the commute times across our chosen set of cities. Thus, it might be more useful to use discrete colors to show which range the commute value for a particular city falls in. Below is a rough representation of this new design.
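As a rough sketch of the discrete-color idea, the snippet below assigns each commute time to one of a few ranges and looks up a color for that range. The breakpoints, the color values, and the city names and times are hypothetical and were not taken from the Time graphic.

```python
from bisect import bisect_right

# Hypothetical breakpoints in minutes: <20, 20 to <25, 25 to <30, 30 and up.
EDGES = [20.0, 25.0, 30.0]

# One color per range, light to dark (hypothetical sequential palette).
COLORS = ["#edf8e9", "#bae4b3", "#74c476", "#238b45"]


def bin_index(minutes, edges=EDGES):
    """Index of the range that `minutes` falls in; a value equal to an edge goes in the upper range."""
    return bisect_right(edges, minutes)


def color_for(minutes):
    """Discrete color for an average commute time."""
    return COLORS[bin_index(minutes)]


# Hypothetical average commute times, in minutes.
commutes = {"City A": 19.0, "City B": 20.0, "City C": 27.5, "City D": 35.0}

for city, minutes in commutes.items():
    print(city, minutes, color_for(minutes))
```

With only a handful of ranges, small differences between cities collapse into the same color, which suits data where the values cluster closely.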

We also came up with another design in which we attempted to show the relationship between cities that have similar commute values. We thought this might be useful especially if the commute times (or some similar quantity) vary significantly. In this new design we used a 3D map to encode city position as usual. A color bar representing the range of values is placed at the top of this map.

To show the correspondence between these values and cities we can use connections. To reduce the clutter an interactive plot can be made that highlights connections based on the selection in the color bar region.

Team member:  Ye Liu only

Original Design:
Cover of Independent, U.K., Jul 21, 2006

Source: http://graphics.cs.wisc.edu/Courses/Visualization10/archives/661-assignment-4a-critique-by-ye-liu

Analysis:

Problem definition: The authors want to convey the following to their audience:

  1. The U.N. has called for an immediate ceasefire;
  2. Three countries, including the U.S.A. and the U.K., do not back this call;
  3. Most importantly, only three countries, including the U.S.A. and the U.K., do not back this call;
  4. By implication, the authors do not support their government’s behavior.

Data:

A: attitude towards the U.N. ceasefire call, a Boolean categorical variable that takes only the values “yes” or “no”.

B: nations. In this case, we consider only one property: their attitude towards the U.N. ceasefire call.

Abstraction:

Compare the nations who back the U.N. call with those who don’t. The fact that the U.K. and the U.S. are in a tiny minority speaks for itself. The implication is that the U.K. and the U.S. are against peace, and so against the rest of the world.

Mapping and encoding:

  1. To emphasize the equality of all nations in the world, all properties except for the name and symbol of each nation are omitted from the illustration.
  2. Using national flags of the same sizes as the symbol for the nations to further emphasize the equality.
  3. Put the nations with the answer “yes“ and those with the answer “no” side by side to compare.
  4. Split the two categories from a standard list of national flags, which has more implications that the entire world is composed by the nations with different opinions.

Drawbacks:

No known drawbacks in this specific case. Someone might argue that it uses a big illustration to state a simple fact, and omits a lot of detail about the nations. It might be hard to count the nations who back the U.N. call. But the authors use the illustration to create more emotional appeal, and to emphasize the abnormal behavior of the U.K. and the U.S.

New design:

I’m using different colors for the positions and areas of different nations on the world map to compare their attitudes. The good point is that it still shows an obvious comparison and still indicates that the U.K. and the U.S.A. are in the minority. However, the drawbacks are:

  1. Too complicated. Coloring different areas on the world map is not an easy job.
  2. The contrast and the visual shock are reduced, as the U.K. and the U.S.A. occupy a large territory, which seemingly enhances their power to say “NO”.

By Faisal Khan and Shuang Huang

Source:

http://graphics.cs.wisc.edu/Courses/Visualization10/archives/471-bank-graph

This design is believed to be a bad one, since it gives the audience the information that JP Morgan performs better than all the competitors. The comparison is a bit misleading by using the size of circles.

Problem definition:

The graph is to show the shrinkage of banks from 2007 to 2009.

Data:

1.    Time: Two time points, Q2 2007 and 01/20/2009.

2.    Bank: A categorical variable.

3.    Market value: The market value of each bank at each time point, in billions of dollars.

Abstraction:

Compare the difference of performance among banks. Especially, show JP Morgan’s good performance.

Mapping and Encoding:

  1. Each bank’s market values are shown in two circles, old one surrounding current value.
  2. Green represents current market value and blue means old one.
  3. The value is proportional to circle size. So the size of the circle represents the market value.
  4. The banks are ordered arbitrarily.
  5. Citibank, which does not perform well, is placed in the center of the graph.

Drawbacks:

  1. The graph tends to suggest that JP Morgan performs best. Actually, it is the best in neither current market value nor percentage of shrinkage.
  2. The order of placing the banks is arbitrary.
  3. There are no specific criteria to compare the performance.

New graph:

  1. It orders the banks based on their current market values.
  2. It shows the percentage of shrinkage clearly by listing the exact number.
  3. It uses a more common technique and is easy to understand.
  4. It does not highlight any bank, and makes the comparison fair.
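
The two quantities behind the new graph, the ordering by current value and the percentage of shrinkage, can be sketched as follows. The bank names and market values are placeholders, not figures from the original graphic.

```python
def shrinkage_pct(old_value, new_value):
    """Percentage of market value lost between the two time points."""
    return 100.0 * (old_value - new_value) / old_value


# Hypothetical banks: (market value at the earlier date, market value at the later date), in billions of dollars.
banks = {
    "Bank A": (200.0, 50.0),
    "Bank B": (100.0, 80.0),
    "Bank C": (150.0, 30.0),
}

# One row per bank, ordered by current (later) market value, largest first.
rows = sorted(
    ((name, old, new, shrinkage_pct(old, new)) for name, (old, new) in banks.items()),
    key=lambda row: row[2],
    reverse=True,
)

for name, old, new, pct in rows:
    print(f"{name}: {old:.0f} -> {new:.0f} ({pct:.0f}% shrinkage)")
```

Listing the exact percentage next to each bar gives the reader an explicit criterion for comparing performance, so no single bank has to be highlighted.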