IS2610 Assignment #1 due by 6PM Sept 15, 2005 As disscussed in class this evening your assignment for next week is to use the weighted quick union program 1.3 from page 17 as part of the Kruskal method for producing a minimal spanning tree. We will use the latitude and longitude coordinates of U.S. cities as a test case. The text form of the data (originally from http://www.realestate3d.com/gps/latlong.htm) is on my class web page as http://www.psc.edu/~awetzel/latlong.txt A smaller 7 city subset that is more suitable for your debugging work is in http://www.psc.edu/~awetzel/smalltest.txt. The data format is ... 40.65 75.43 Allentown,PA 42.08 80.18 Erie,PA 40.22 76.85 Harrisburg,PA 40.13 76.30 Lancaster,PA 40.28 79.40 Latrobe,PA 39.88 75.25 Philadelphia,PA 40.50 80.22 Pittsburgh,PA ... where the two numbers are the latitude and longitude coordinates that we will use for our computations. We will ignore the curvature of the earth and the contraction of the longitude lines and pretend that these coordinates can be directly used as equal x and y positions for the purpose of computing city to city distances using the Pythagorean principal. (ie. the sum of the square of the hypotenuse is equal to the sum of the squares of the two right angle edges of a triangle.) Our solution will come from a command pipeline like this ... gendist < latlong.txt | sort -n | connect | span2ps > spantree.ps Your part of the assignment is to adapt my template codes (available on the class web page) to make the gendist and connect programs. The span2ps program does not need to be changed and sort is a standard utility. Your part of the assignment is really quite small and is intended more to help you develop your initial C/Unix skills rather than to perform any difficult computation. Your first task is to modity the template www.psc.edu/~awetzel/gendist.c to print a complete list of all of the city to city distances in a form that can be used by the later steps in the process. The essential thing is to print all of all of the information on the standard output (thats where printf goes) and to carry along the city line numbers and coordinates so that each printf line represents one potential city to city connection. My gendist output for the smalltest file looks like this ... 4.960585 1 0 42.080002 80.180000 40.650002 75.430000 1.483676 2 0 40.220001 76.849998 40.650002 75.430000 3.814251 2 1 40.220001 76.849998 42.080002 80.180000 1.013561 3 0 40.130001 76.300003 40.650002 75.430000 4.342451 3 1 40.130001 76.300003 42.080002 80.180000 0.557311 3 2 40.130001 76.300003 40.220001 76.849998 3.987206 4 0 40.279999 79.400002 40.650002 75.430000 1.961736 4 1 40.279999 79.400002 42.080002 80.180000 2.550709 4 2 40.279999 79.400002 40.220001 76.849998 3.103625 4 3 40.279999 79.400002 40.130001 76.300003 0.790760 5 0 39.880001 75.250000 40.650002 75.430000 5.398602 5 1 39.880001 75.250000 42.080002 80.180000 1.635725 5 2 39.880001 75.250000 40.220001 76.849998 1.079355 5 3 39.880001 75.250000 40.130001 76.300003 4.169234 5 4 39.880001 75.250000 40.279999 79.400002 4.792349 6 0 40.500000 80.220001 40.650002 75.430000 1.580508 6 1 40.500000 80.220001 42.080002 80.180000 3.381615 6 2 40.500000 80.220001 40.220001 76.849998 3.937421 6 3 40.500000 80.220001 40.130001 76.300003 0.848999 6 4 40.500000 80.220001 40.279999 79.400002 5.008524 6 5 40.500000 80.220001 39.880001 75.250000 If you are working with N cities then there will be N*(N-1)/2 paths. Therefore, for the 7 city smalltest there are 21 lines that will feed into the sort process. Sort will rearrange these lines and output them in order of their connection lengths with the shortest paths first. This sorted data feeds into the connect program which is just an adaptation of the same program 1.3 from page 17 that you worked with last week. (I have a version of that as http://www.psc.edu/~awetzel/connect.c) Your task is to modify connect.c to carry along the coordinate information in its output so that we can generate a 2-dimensional graph at the end of the pipeline. The city line numbers from the input to gendist (that are the 2nd and 3rd colums above) will serve as the p and q values that connect needs in order to do its job which is to throw away all of the redundant connections and only pass along the connections that will be part of the final spanning tree. My example output from connect for the small test looks like this ... 0.557311 3 2 40.130001 76.300003 40.220001 76.849998 0.790760 5 0 39.880001 75.250000 40.650002 75.430000 0.848999 6 4 40.500000 80.220001 40.279999 79.400002 1.013561 3 0 40.130001 76.300003 40.650002 75.430000 1.580508 6 1 40.500000 80.220001 42.080002 80.180000 2.550709 4 2 40.279999 79.400002 40.220001 76.849998 Notice that there are 6 connections to connect the 7 cities. The span2ps program simply looks at the coordinates to draw the connection lines in a crude printable PostScript file. All you have to turn in to me is your modified source code for gendist.c and connect.c. Just email them to awetzel@psc.edu as plain text or as an attachment. Please do NOT send doc files, html files or your compiled binaries. I will have a script that recompiles and tests your programs. Please keep in touch by email if you have problems especially at this early stage of the term while you are getting accustomed to C & Unix. Art