Tuesday 3 February 2015

African capital cities and population in table format

Bah! I only got part-way through this one, but I'll post what I have. I wanted to load up some data and pretty-print a table of African countries and their capital cities. Once I had that working, I thought: why not population too, this time from Wikipedia? The problem is that the two data sets name some countries differently (yeah, I guess this is going to be a problem in general for the semantic web). Then I got greedy and thought: why not a more comprehensive data set for Africa? I looked at Wikipedia again, gave it some thought, and changed my mind. Too much work; it would almost need a custom parser, and I'm too lazy for that. So I gave up. Here is what I do have:
sa: load africa.sw
sa: table[name,capital-city,population] "" |Africa: country: list>
+---------------------------------------+--------------------+-----------------------+
| name                                  | capital-city       | population            |
+---------------------------------------+--------------------+-----------------------+
| country: Algeria                      | city: Algiers      | population: 39903000  |
| country: Angola                       | city: Luanda       | population: 25326000  |
| country: Benin                        | city: Porto-Novo   | population: 10750000  |
| country: Botswana                     | city: Gaborone     | population: 2176000   |
| country: Burkina Faso                 | city: Ouagadougou  | population: 18477000  |
| country: Burundi                      | city: Bujumbura    | population: 9824000   |
| country: Cameroon                     | city: Yaounde      | population: 21918000  |
| country: Cape Verde                   | city: Praia        | population: 525000    |
| country: Central African Republic     | city: Bangui       | population: 5545000   |
| country: Chad                         | city: N'Djamena    | population: 13675000  |
| country: Cote d'Ivoire                | city: Yamoussoukro |                       |
| country: Democratic Republic of Congo | city: Kinshasa     |                       |
| country: Egypt                        | city: Cairo        | population: 88523000  |
| country: Equatorial Guinea            | city: Malabo       | population: 1996000   |
| country: Eritrea                      | city: Asmara       | population: 6895000   |
| country: Ethiopia                     | city: AddisAbaba   | population: 90076000  |
| country: Gabon                        | city: Libreville   | population: 2382000   |
| country: Ghana                        | city: Accra        | population: 27714000  |
| country: Kenya                        | city: Nairobi      | population: 44153000  |
| country: Lesotho                      | city: Maseru       | population: 1908000   |
| country: Liberia                      | city: Monrovia     | population: 4046000   |
| country: Libya                        | city: Tripoli      | population: 6521000   |
| country: Madagascar                   | city: Antananarivo | population: 23053000  |
| country: Malawi                       | city: Lilongwe     | population: 16307000  |
| country: Mali                         | city: Bamako       | population: 17796000  |
| country: Mauritania                   | city: Nouakchott   | population: 3632000   |
| country: Mauritius                    | city: Port Louis   | population: 1263000   |
| country: Morocco                      | city: Rabat        | population: 33656000  |
| country: Mozambique                   | city: Maputo       | population: 25728000  |
| country: Niger                        | city: Niamey       | population: 18880000  |
| country: Nigeria                      | city: Abuja        | population: 185043000 |
| country: Republic of Congo            | city: Brazzaville  |                       |
| country: Republic of Djibouti         | city: Djibouti     |                       |
| country: Republic of Guinea           | city: Conakry      |                       |
| country: Republic of Namibia          | city: Windhoek     |                       |
| country: Republic of South Sudan      | city: Juba         |                       |
| country: Republic of Sudan            | city: Khartoum     |                       |
| country: Republic of Tunisia          | city: Tunis        |                       |
| country: Rwanda                       | city: Kigali       | population: 11324000  |
| country: Sao Tome and Principe        | city: Sao Tome     |                       |
| country: Senegal                      | city: Dakar        | population: 14150000  |
| country: Seychelles                   | city: Victoria     | population: 97000     |
| country: Sierra Leone                 | city: Freetown     | population: 6513000   |
| country: Somalia                      | city: Mogadishu    | population: 10972000  |
| country: South Africa                 | city: Pretoria     | population: 54844000  |
| country: Swaziland                    | city: Mbabane      | population: 1097000   |
| country: Tanzania                     | city: Dodoma       | population: 48829000  |
| country: The Gambia                   | city: Banjul       |                       |
| country: Togo                         | city: Lome         | population: 7065000   |
| country: Uganda                       | city: Kampala      | population: 35760000  |
| country: Union of Comoros             | city: Moroni       |                       |
| country: Western Sahara               | city: El Aaiun     |                       |
| country: Zambia                       | city: Lusaka       | population: 15474000  |
| country: Zimbabwe                     | city: Harare       | population: 13503000  |
+---------------------------------------+--------------------+-----------------------+
And I guess that is it for now.
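The empty population cells in the table mostly sit next to names such as "Republic of Namibia" or "The Gambia", which suggests the two data sets simply spell those countries differently. As a rough sketch of one way to reconcile them (in Python rather than in the console language, and with the prefix list and the population-side spellings being my assumptions, not anything taken from africa.sw):

```python
# Prefixes assumed to be the only difference between the two naming schemes.
PREFIXES = ("Republic of ", "Union of ", "The ")


def normalise(name):
    """Strip one leading prefix, e.g. 'Republic of Namibia' -> 'Namibia'."""
    for prefix in PREFIXES:
        if name.startswith(prefix):
            return name[len(prefix):]
    return name


# Names as they appear in the capital-city table above.
capital_names = ["Kenya", "Republic of Namibia", "The Gambia", "Union of Comoros"]
# Hypothetical spellings in the population data set (an assumption).
population_names = {"Kenya", "Namibia", "Gambia", "Comoros"}

# Names with no exact match, before and after normalising.
unmatched_before = [n for n in capital_names if n not in population_names]
unmatched_after = [n for n in capital_names if normalise(n) not in population_names]

print(unmatched_before)
print(unmatched_after)
```

A prefix list like this would not help with genuine spelling differences (accents, "Ivory Coast" versus "Cote d'Ivoire", and so on), so a hand-written alias table is probably still needed for the stragglers.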

Update: if we don't want "country: ", "city: " and "population: " in there, we have to do a dance. At first guess you (I certainly did) might try applying "extract-value" to the incoming superposition, to strip the "country: " prefix. But here is what happens (first 5 results, for brevity's sake):
sa: table[country,capital-city,population] extract-value select[1,5] "" |Africa: country: list>
+--------------+--------------+------------+
| country      | capital-city | population |
+--------------+--------------+------------+
| Algeria      |              |            |
| Angola       |              |            |
| Benin        |              |            |
| Botswana     |              |            |
| Burkina Faso |              |            |
+--------------+--------------+------------+
So why did this happen? Because we have:
capital-city |country: Algeria> => |city: Algiers>
population |country: Algeria> => |population: 39903000>
but we have no knowledge of "capital-city" and "population" applied to kets without the "country: " data-type, i.e. these are undefined:
capital-city |Algeria> => ...
population |Algeria> => ...
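A loose Python analogy of why the lookups come back empty, assuming rules behave like a dictionary keyed on the full ket label. The names apply_op and extract_value here are my stand-ins, not the project's code, and the prefix-stripping rule is an assumption:

```python
# The two learned rules from above, keyed by (operator, full ket label).
rules = {
    ("capital-city", "country: Algeria"): "city: Algiers",
    ("population", "country: Algeria"): "population: 39903000",
}


def apply_op(op, label):
    """Return the stored value, or "" when no rule exists for this exact label."""
    return rules.get((op, label), "")


def extract_value(label):
    """Keep only the text after the last ': ' (assumed behaviour)."""
    return label.rsplit(": ", 1)[-1]


with_prefix = apply_op("capital-city", "country: Algeria")
without_prefix = apply_op("capital-city", extract_value("country: Algeria"))

print(repr(with_prefix))
print(repr(without_prefix))
```

Once the prefix is stripped, the key no longer matches anything, which is the same shape of failure as the empty columns in the table.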
So this is where the dancing kicks in (NB: merge-labels() re-inserts the "country: " data-type):
capital |*> #=> extract-value capital-city merge-labels(|country: > + |_self>)
popn |*> #=> to-comma-number extract-value population merge-labels(|country: > + |_self>)
And now the tidied table:
sa: table[country,capital,popn] extract-value "" |Africa: country: list>
+------------------------------+--------------+-------------+
| country                      | capital      | popn        |
+------------------------------+--------------+-------------+
| Algeria                      | Algiers      | 39,903,000  |
| Angola                       | Luanda       | 25,326,000  |
| Benin                        | Porto-Novo   | 10,750,000  |
| Botswana                     | Gaborone     | 2,176,000   |
| Burkina Faso                 | Ouagadougou  | 18,477,000  |
| Burundi                      | Bujumbura    | 9,824,000   |
| Cameroon                     | Yaounde      | 21,918,000  |
| Cape Verde                   | Praia        | 525,000     |
| Central African Republic     | Bangui       | 5,545,000   |
| Chad                         | N'Djamena    | 13,675,000  |
| Cote d'Ivoire                | Yamoussoukro |             |
| Democratic Republic of Congo | Kinshasa     |             |
| Egypt                        | Cairo        | 88,523,000  |
| Equatorial Guinea            | Malabo       | 1,996,000   |
| Eritrea                      | Asmara       | 6,895,000   |
| Ethiopia                     | AddisAbaba   | 90,076,000  |
| Gabon                        | Libreville   | 2,382,000   |
| Ghana                        | Accra        | 27,714,000  |
| Kenya                        | Nairobi      | 44,153,000  |
| Lesotho                      | Maseru       | 1,908,000   |
| Liberia                      | Monrovia     | 4,046,000   |
| Libya                        | Tripoli      | 6,521,000   |
| Madagascar                   | Antananarivo | 23,053,000  |
| Malawi                       | Lilongwe     | 16,307,000  |
| Mali                         | Bamako       | 17,796,000  |
| Mauritania                   | Nouakchott   | 3,632,000   |
| Mauritius                    | Port Louis   | 1,263,000   |
| Morocco                      | Rabat        | 33,656,000  |
| Mozambique                   | Maputo       | 25,728,000  |
| Niger                        | Niamey       | 18,880,000  |
| Nigeria                      | Abuja        | 185,043,000 |
| Republic of Congo            | Brazzaville  |             |
| Republic of Djibouti         | Djibouti     |             |
| Republic of Guinea           | Conakry      |             |
| Republic of Namibia          | Windhoek     |             |
| Republic of South Sudan      | Juba         |             |
| Republic of Sudan            | Khartoum     |             |
| Republic of Tunisia          | Tunis        |             |
| Rwanda                       | Kigali       | 11,324,000  |
| Sao Tome and Principe        | Sao Tome     |             |
| Senegal                      | Dakar        | 14,150,000  |
| Seychelles                   | Victoria     | 97,000      |
| Sierra Leone                 | Freetown     | 6,513,000   |
| Somalia                      | Mogadishu    | 10,972,000  |
| South Africa                 | Pretoria     | 54,844,000  |
| Swaziland                    | Mbabane      | 1,097,000   |
| Tanzania                     | Dodoma       | 48,829,000  |
| The Gambia                   | Banjul       |             |
| Togo                         | Lome         | 7,065,000   |
| Uganda                       | Kampala      | 35,760,000  |
| Union of Comoros             | Moroni       |             |
| Western Sahara               | El Aaiun     |             |
| Zambia                       | Lusaka       | 15,474,000  |
| Zimbabwe                     | Harare       | 13,503,000  |
+------------------------------+--------------+-------------+
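For anyone who finds the "dance" easier to follow outside the console, here is a minimal Python sketch of the same idea: put the "country: " prefix back on before the lookup, then strip the prefix from the answer. All function names are my own stand-ins for the operators, and the dictionary is an assumed model of the stored rules:

```python
# A few rules taken from the first table; Cote d'Ivoire has no population there.
rules = {
    ("capital-city", "country: Algeria"): "city: Algiers",
    ("population", "country: Algeria"): "population: 39903000",
    ("capital-city", "country: Cote d'Ivoire"): "city: Yamoussoukro",
}


def apply_op(op, label):
    """Return the stored value, or "" when no rule exists for this label."""
    return rules.get((op, label), "")


def extract_value(label):
    """Keep only the text after the last ': ' (assumed behaviour)."""
    return label.rsplit(": ", 1)[-1]


def merge_labels(*parts):
    """Concatenate label fragments into one label."""
    return "".join(parts)


def to_comma_number(text):
    """Format a digit string with thousands separators; leave "" alone."""
    return f"{int(text):,}" if text else ""


def capital(name):
    # Re-insert the data-type, look up, then strip the "city: " prefix.
    return extract_value(apply_op("capital-city", merge_labels("country: ", name)))


def popn(name):
    # Same dance, plus comma formatting of the number.
    value = extract_value(apply_op("population", merge_labels("country: ", name)))
    return to_comma_number(value)


print(capital("Algeria"), popn("Algeria"))
print(capital("Cote d'Ivoire"), repr(popn("Cote d'Ivoire")))
```

A missing population stays an empty string all the way through, which matches the blank cells in the tidied table.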
And one more for luck! Top 10 countries in Africa sorted by population, in a rank table, with "country: ", "city: ", and "population: " removed:
sa: rank-table[country,capital,popn] select[1,10] extract-value reverse sort-by[population] "" |Africa: country: list> 
+------+--------------+------------+-------------+
| rank | country      | capital    | popn        |
+------+--------------+------------+-------------+
| 1    | Nigeria      | Abuja      | 185,043,000 |
| 2    | Ethiopia     | AddisAbaba | 90,076,000  |
| 3    | Egypt        | Cairo      | 88,523,000  |
| 4    | South Africa | Pretoria   | 54,844,000  |
| 5    | Tanzania     | Dodoma     | 48,829,000  |
| 6    | Kenya        | Nairobi    | 44,153,000  |
| 7    | Algeria      | Algiers    | 39,903,000  |
| 8    | Uganda       | Kampala    | 35,760,000  |
| 9    | Morocco      | Rabat      | 33,656,000  |
| 10   | Ghana        | Accra      | 27,714,000  |
+------+--------------+------------+-------------+
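The operator chain reads right to left: sort by population, reverse, strip the prefixes, keep the first 10. A small Python sketch of that pipeline, using a handful of rows from the first table; treating a missing population as 0 so it sorts last is my assumption, not necessarily what sort-by does:

```python
# A few rows copied from the first table, prefixes and all.
rows = [
    ("country: Algeria", "city: Algiers", "population: 39903000"),
    ("country: Cote d'Ivoire", "city: Yamoussoukro", ""),
    ("country: Egypt", "city: Cairo", "population: 88523000"),
    ("country: Ethiopia", "city: AddisAbaba", "population: 90076000"),
    ("country: Nigeria", "city: Abuja", "population: 185043000"),
]


def value(label):
    """Keep only the text after the last ': '."""
    return label.rsplit(": ", 1)[-1]


def population(row):
    """Population as an int; missing values count as 0 (an assumption)."""
    text = value(row[2])
    return int(text) if text else 0


def rank_table(rows, top):
    """Sort by population (largest first), keep `top` rows, tidy the labels."""
    ordered = sorted(rows, key=population, reverse=True)[:top]
    table = []
    for rank, row in enumerate(ordered, start=1):
        count = population(row)
        popn = f"{count:,}" if count else ""
        table.append((rank, value(row[0]), value(row[1]), popn))
    return table


for line in rank_table(rows, 3):
    print(line)
```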
And I think that is it for this post. I'm slowly learning the best way to interact with tables.

Update: the table code now auto-applies the "extract-value" operator to table elements, so we no longer need to "do a dance" to remove the category text cluttering up our tables. Now it is simply:
sa: load africa.sw
sa: popn |*> #=> to-comma-number population |_self>
sa: rank-table[country,capital-city,popn] select[1,10] reverse sort-by[population] "" |Africa: country: list>
+------+--------------+--------------+-------------+
| rank | country      | capital-city | popn        |
+------+--------------+--------------+-------------+
| 1    | Nigeria      | Abuja        | 185,043,000 |
| 2    | Ethiopia     | AddisAbaba   | 90,076,000  |
| 3    | Egypt        | Cairo        | 88,523,000  |
| 4    | South Africa | Pretoria     | 54,844,000  |
| 5    | Tanzania     | Dodoma       | 48,829,000  |
| 6    | Kenya        | Nairobi      | 44,153,000  |
| 7    | Algeria      | Algiers      | 39,903,000  |
| 8    | Uganda       | Kampala      | 35,760,000  |
| 9    | Morocco      | Rabat        | 33,656,000  |
| 10   | Ghana        | Accra        | 27,714,000  |
+------+--------------+--------------+-------------+
I'm always in favour of minimising how much work you need to do. So I think this is a big improvement! No dancing just to tidy up the table a little.
