sa: load 300k--wikipedia-links.sw
sa: find-inverse[links-to]
sa: H |*> #=> how-many inverse-links-to merge-labels(|WP: > + |_self>)
sa: S |*> #=> table[wikipage,coeff] select[1,60] 100 self-similar[inverse-links-to] merge-labels(|WP: > + |_self>)

sa: S |Love>
+-------------------------------+--------+
| wikipage                      | coeff  |
+-------------------------------+--------+
| Love                          | 100.0  |
| Pride                         | 17.391 |
| Pleasure                      | 13.043 |
| Jealousy                      | 13.043 |
| Philotes_(mythology)          | 13.043 |
| Imagination                   | 13.043 |
| Pity                          | 13.043 |
| Envy                          | 13.043 |
| Peace                         | 12.121 |
| Matter                        | 12     |
| Fear                          | 8.696  |
| Measurement                   | 8.696  |
| Number                        | 8.696  |
| Observation                   | 8.696  |
| Misanthropy                   | 8.696  |
| Piety                         | 8.696  |
| Courage                       | 8.696  |
| Hope                          | 8.696  |
| Lust                          | 8.696  |
| Asteria                       | 8.696  |
| Orthrus                       | 8.696  |
| Modesty                       | 8.696  |
| Punishment                    | 8.696  |
| Idea                          | 8.696  |
| Politeness                    | 8.696  |
| Learning                      | 8.696  |
| Luck                          | 8.696  |
| Sexual_attraction             | 8.696  |
| Necessity                     | 8.696  |
| Physical_intimacy             | 8.696  |
| Wrath                         | 8.696  |
| Gluttony                      | 8.696  |
| Prediction                    | 8.696  |
| Darkness                      | 8.696  |
| Safety                        | 8.696  |
| Optimism                      | 8.696  |
| Doubt                         | 8.696  |
| Moderation                    | 8.696  |
| Compassion                    | 8.696  |
| Respect                       | 8.696  |
| Nomenclature                  | 8.696  |
| Courtship                     | 8.696  |
| Jonathan_Barnes               | 8.696  |
| DielsKranz_numbering_system   | 8.696  |
| John_Raven                    | 8.696  |
| De_amore_(Andreas_Capellanus) | 8.696  |
| Infatuation                   | 8.696  |
| Category:Love                 | 8.696  |
| Contempt                      | 8.696  |
| Memory                        | 8.696  |
| Quantity                      | 8.696  |
| cyclops                       | 8.696  |
| Curiosity                     | 8.696  |
| Passion_(emotion)             | 8.696  |
| Category:Philosophy_of_love   | 8.696  |
| nonverbal_communication       | 8.696  |
| Air                           | 8.696  |
| Neikea                        | 8.696  |
| Peter_Kingsley_(scholar)      | 8.696  |
| Inquiry                       | 8.696  |
+-------------------------------+--------+
Time taken: 1 hour, 42 minutes, 23 seconds, 210 milliseconds

sa: S |Knowledge>
+----------------------------+--------+
| wikipage                   | coeff  |
+----------------------------+--------+
| Knowledge                  | 100.0  |
| Inquiry                    | 16     |
| Measurement                | 12     |
| Pride                      | 12     |
| Idea                       | 12     |
| Learning                   | 12     |
| Prediction                 | 12     |
| Experience                 | 12     |
| Memory                     | 12     |
| Intelligence_(trait)       | 12     |
| understanding              | 10.345 |
| Imre_Lakatos               | 8.333  |
| Beauty                     | 8      |
| Outline_of_education       | 8      |
| Faith                      | 8      |
| Love                       | 8      |
| Meaning_of_life            | 8      |
| Metaphor                   | 8      |
| Nominalism                 | 8      |
| Number                     | 8      |
| Observation                | 8      |
| Platonic_idealism          | 8      |
| Pain                       | 8      |
| Pathological_science       | 8      |
| Problem_of_other_minds     | 8      |
| Misanthropy                | 8      |
| Piety                      | 8      |
| Virtue                     | 8      |
| Lust                       | 8      |
| Discovery_(observation)    | 8      |
| Ineffability               | 8      |
| Belief                     | 8      |
| Organization               | 8      |
| Modesty                    | 8      |
| Placebo                    | 8      |
| Punishment                 | 8      |
| Quasi-empirical_method     | 8      |
| Pleasure                   | 8      |
| Jealousy                   | 8      |
| Authority                  | 8      |
| Karl_Mannheim              | 8      |
| Paradigm                   | 8      |
| Intensionality             | 8      |
| Problem_of_induction       | 8      |
| Necessity                  | 8      |
| Elegance                   | 8      |
| Prattyasamutpda            | 8      |
| Moderation                 | 8      |
| Phenomenalism              | 8      |
| Nomenclature               | 8      |
| Potentiality_and_actuality | 8      |
| Max_Scheler                | 8      |
| Matter                     | 8      |
| Panpsychism                | 8      |
| Information                | 8      |
| knowledge_management       | 8      |
| Lev_Shestov                | 8      |
| Interpretation_(logic)     | 8      |
| Outline_of_philosophy      | 8      |
| Outline_of_logic           | 8      |
+----------------------------+--------+
Time taken: 1 hour, 48 minutes, 29 seconds, 868 milliseconds

sa: H |Google>
|number: 704>

sa: S |Google>
+---------------------------------------+--------+
| wikipage                              | coeff  |
+---------------------------------------+--------+
| Google                                | 100.0  |
| Apple_Inc.                            | 14.063 |
| Microsoft                             | 12.732 |
| Facebook                              | 11.222 |
| Yahoo!                                | 9.375  |
| World_Wide_Web                        | 8.807  |
| IBM                                   | 8.093  |
| Sun_Microsystems                      | 7.955  |
| Android_(operating_system)            | 7.812  |
| Internet                              | 7.487  |
| Amazon.com                            | 7.102  |
| Intel                                 | 6.676  |
| Linux                                 | 6.537  |
| Hewlett-Packard                       | 6.25   |
| Stanford_University                   | 6.108  |
| Twitter                               | 6.108  |
| web_browser                           | 6.108  |
| HTML                                  | 5.824  |
| operating_system                      | 5.803  |
| YouTube                               | 5.657  |
| Forbes                                | 5.384  |
| Massachusetts_Institute_of_Technology | 5.324  |
| Java_(programming_language)           | 4.83   |
| AOL                                   | 4.687  |
| smartphone                            | 4.687  |
| open_source                           | 4.687  |
| C_(programming_language)              | 4.608  |
| Silicon_Valley                        | 4.545  |
| Nokia                                 | 4.403  |
| C++                                   | 4.403  |
| Microsoft_Windows                     | 4.354  |
| JavaScript                            | 4.261  |
| Wired_(magazine)                      | 4.261  |
| Motorola                              | 4.119  |
| XML                                   | 4.119  |
| Wall_Street_Journal                   | 4.119  |
| CNET                                  | 4.119  |
| copyright                             | 4.119  |
| software                              | 4.119  |
| Oracle_Corporation                    | 3.977  |
| Sony                                  | 3.977  |
| Unix                                  | 3.977  |
| Mac_OS_X                              | 3.977  |
| Wikipedia                             | 3.977  |
| Internet_Explorer                     | 3.835  |
| OS_X                                  | 3.835  |
| source_code                           | 3.835  |
| eBay                                  | 3.835  |
| computer_science                      | 3.748  |
| University_of_California,_Berkeley    | 3.732  |
| IP_address                            | 3.693  |
| Larry_Page                            | 3.693  |
| iPhone                                | 3.693  |
| algorithm                             | 3.693  |
| free_software                         | 3.693  |
| University_of_Michigan                | 3.551  |
| GNU_General_Public_License            | 3.551  |
| database                              | 3.551  |
| Carnegie_Mellon_University            | 3.409  |
| Cisco_Systems                         | 3.409  |
+---------------------------------------+--------+
Time taken: 1 day, 18 hours, 53 minutes, 1 second, 791 milliseconds

sa: H |Blog>
|number: 32>

sa: S |Blog>
+-----------------------------------------------------------+-------+
| wikipage                                                  | coeff |
+-----------------------------------------------------------+-------+
| Blog                                                      | 100   |
| Active_Server_Pages                                       | 9.375 |
| Desktop_publishing                                        | 9.375 |
| Online_chat                                               | 9.375 |
| CAPTCHA                                                   | 9.375 |
| RSS                                                       | 9.302 |
| Dynamic_HTML                                              | 6.25  |
| Malware                                                   | 6.25  |
| Chat_room                                                 | 6.25  |
| Content_management_system                                 | 6.25  |
| ABC_World_News_Tonight                                    | 6.25  |
| Cross-site_scripting                                      | 6.25  |
| Primetime_(TV_series)                                     | 6.25  |
| Phishing                                                  | 6.25  |
| home_page                                                 | 6.25  |
| Open_source_software                                      | 6.25  |
| impact_factor                                             | 6.25  |
| Terminate_and_Stay_Resident                               | 6.25  |
| electronic_mailing_list                                   | 6.25  |
| Podcast                                                   | 6.25  |
| Google_Scholar                                            | 6.25  |
| OPML                                                      | 6.25  |
| feed_aggregator                                           | 6.25  |
| peer-review                                               | 6.25  |
| Social_networking_service                                 | 6.25  |
| Digg                                                      | 6.25  |
| carbon_copy                                               | 6.25  |
| online_community                                          | 6.25  |
| Freemium                                                  | 6.25  |
| Microsoft_Silverlight                                     | 6.25  |
| Wikia                                                     | 6.25  |
| Peer-to-peer_file_sharing                                 | 6.25  |
| Fully_qualified_domain_name                               | 6.25  |
| Category:Internet_forums                                  | 6.25  |
| Category:American_broadcast_news_analysts                 | 6.25  |
| arXiv.org                                                 | 6.25  |
| preprint                                                  | 6.25  |
| Cicada_3301                                               | 6.25  |
| fansite                                                   | 6.25  |
| Affiliate_marketing                                       | 6.25  |
| Category:American_television_news_anchors                 | 6.25  |
| Category:ABC_News_personalities                           | 6.25  |
| Category:American_television_reporters_and_correspondents | 6.25  |
| Lisa_McRee                                                | 6.25  |
| Category:Electronic_publishing                            | 6.25  |
| Kevin_Newman_(journalist)                                 | 6.25  |
| Robin_Roberts_(sportscaster)                              | 6.25  |
| Internet_Information_Services                             | 6.061 |
| newsmagazine                                              | 5.882 |
| George_Stephanopoulos                                     | 5.714 |
| news_presenter                                            | 5.714 |
| FAQ                                                       | 5.556 |
| Internet_meme                                             | 5.405 |
| Common_Gateway_Interface                                  | 5.263 |
| Bulletin_board_system                                     | 5.172 |
| Internet_slang                                            | 5     |
| news_anchor                                               | 4.651 |
| Document_Object_Model                                     | 4.444 |
| Staff_writer                                              | 4.444 |
| web_application                                           | 4.348 |
+-----------------------------------------------------------+-------+
Time taken: 2 hours, 12 minutes, 50 seconds, 381 milliseconds

sa: H |arXiv.org>
|number: 3>

sa: S |arXiv.org>
+------------------------------------------------------------------------------+--------+
| wikipage                                                                     | coeff  |
+------------------------------------------------------------------------------+--------+
| arXiv.org                                                                    | 100    |
| citation_impact                                                              | 40     |
| serials_crisis                                                               | 40     |
| NEC_Research_Institute                                                       | 40     |
| postprint                                                                    | 40     |
| institutional_repository                                                     | 40     |
| OAIster                                                                      | 40     |
| SHERPA_(organisation)                                                        | 40     |
| Category:Electronic_publishing                                               | 40     |
| Paul_Ginsparg                                                                | 33.333 |
| preprint                                                                     | 27.273 |
| self-archiving                                                               | 25     |
| Category:Academic_publishing                                                 | 23.077 |
| Methodological_naturalism                                                    | 20     |
| Presocratics                                                                 | 20     |
| Cryptology_ePrint_Archive                                                    | 20     |
| Open_publishing                                                              | 20     |
| Hubble_diagram                                                               | 20     |
| GZK_paradox                                                                  | 20     |
| List_of_unsolved_problems_in_physics                                         | 20     |
| Print_on_demand                                                              | 20     |
| TeV                                                                          | 20     |
| Boundary_condition                                                           | 20     |
| Black_body_radiation                                                         | 20     |
| Subscriptions                                                                | 20     |
| R.P._Feynman                                                                 | 20     |
| Citeseer                                                                     | 20     |
| Citation_index                                                               | 20     |
| File:Solvay_conference_1927.jpg                                              | 20     |
| File:Senenmut-Grab.JPG                                                       | 20     |
| bioacoustics                                                                 | 20     |
| pattern_formation                                                            | 20     |
| University_Physics                                                           | 20     |
| File:Archimedes-screw_one-screw-threads_with-ball_3D-view_animated_small.gif | 20     |
| Bryn_Mawr_Classical_Review                                                   | 20     |
| File:Acceleration_components.JPG                                             | 20     |
| Delayed_open-access_journal                                                  | 20     |
| Astronomical_ceiling_of_Senemut_Tomb                                         | 20     |
| quantitative_finance                                                         | 20     |
| File:CMS_Higgs-event.jpg                                                     | 20     |
| James_Madison_Award                                                          | 20     |
| Public_Knowledge_Project                                                     | 20     |
| the_central_science                                                          | 20     |
| Difference_between_chemistry_and_physics                                     | 20     |
| theses                                                                       | 20     |
| Optical_physics                                                              | 20     |
| analytic_solution                                                            | 20     |
| weakly_interacting_massive_particle                                          | 20     |
| superclusters                                                                | 20     |
| Open_Humanities_Press                                                        | 20     |
| iBooks_Author                                                                | 20     |
| econophysics                                                                 | 20     |
| ultrasonics                                                                  | 20     |
| OAI-PMH                                                                      | 20     |
| Journal_of_Library_Administration                                            | 20     |
| File:Einstein1921_by_F_Schmutzer_2.jpg                                       | 20     |
| Ancient_Greek_poetry                                                         | 20     |
| Publish_or_perish                                                            | 20     |
| higher_dimension                                                             | 20     |
| IBEX                                                                         | 20     |
+------------------------------------------------------------------------------+--------+
Time taken: 41 minutes, 16 seconds, 954 milliseconds

sa: H |Theory_of_everything>
|number: 13>

sa: S |Theory_of_everything>
+------------------------------------------------------------+--------+
| wikipage                                                   | coeff  |
+------------------------------------------------------------+--------+
| Theory_of_everything                                       | 100.0  |
| Ultimate_fate_of_the_universe                              | 21.429 |
| Planck_scale                                               | 17.391 |
| Big_Rip                                                    | 15.385 |
| Eddington_limit                                            | 15.385 |
| Supersymmetry                                              | 15.385 |
| Arrow_of_time                                              | 15.385 |
| Dimensionless_physical_constant                            | 15.385 |
| Plumian_Professor_of_Astronomy_and_Experimental_Philosophy | 15.385 |
| Sir_Roger_Penrose                                          | 15.385 |
| Bakerian_Lecture                                           | 15.385 |
| grand_unified_theory                                       | 15.385 |
| Big_Freeze                                                 | 15.385 |
| Topological_order                                          | 15.385 |
| Baryon_asymmetry                                           | 15.385 |
| Neutrino_mass                                              | 15.385 |
| Unified_field_theory                                       | 15.385 |
| Membrane_(M-theory)                                        | 15.385 |
| Static_forces_and_virtual-particle_exchange                | 15.385 |
| Generation_(particle_physics)                              | 15.385 |
| Stellar_nucleosynthesis                                    | 14.286 |
| Compact_Muon_Solenoid                                      | 13.333 |
| Cosmic_inflation                                           | 13.333 |
| neutrino_oscillation                                       | 12.5   |
| Hermann_Bondi                                              | 11.765 |
| Category:Presidents_of_the_Royal_Astronomical_Society      | 11.111 |
| YangMills_theory                                           | 11.111 |
| anthropic_principle                                        | 10.345 |
| Dark_matter                                                | 9.524  |
| James_Watson                                               | 9.091  |
| CP_violation                                               | 8      |
| Anisotropy                                                 | 7.692  |
| Antiparticle                                               | 7.692  |
| Acts                                                       | 7.692  |
| Centripetal_force                                          | 7.692  |
| Graviton                                                   | 7.692  |
| Gluon                                                      | 7.692  |
| Hydrogen_atom                                              | 7.692  |
| Liquid_crystal                                             | 7.692  |
| Main_sequence                                              | 7.692  |
| Morphogenesis                                              | 7.692  |
| Panspermia                                                 | 7.692  |
| Proton_decay                                               | 7.692  |
| Qubit                                                      | 7.692  |
| Tokamak                                                    | 7.692  |
| Quintessence_(physics)                                     | 7.692  |
| Sonoluminescence                                           | 7.692  |
| Gravitational_lens                                         | 7.692  |
| High-temperature_superconductor                            | 7.692  |
| Fact                                                       | 7.692  |
| Timeline_of_gravitational_physics_and_relativity           | 7.692  |
| Timeline_of_stellar_astronomy                              | 7.692  |
| List_of_astronomers                                        | 7.692  |
| Astrophysicist                                             | 7.692  |
| Triple-alpha_process                                       | 7.692  |
| Religious                                                  | 7.692  |
| Quark_matter                                               | 7.692  |
| Gravity_assist                                             | 7.692  |
| Theory_of_Everything                                       | 7.692  |
| Color_confinement                                          | 7.692  |
+------------------------------------------------------------+--------+
Time taken: 1 hour, 8 minutes, 42 seconds, 470 milliseconds

OK. Some cool results in there. Actually, I think they are amazing! I think I have done enough examples of this now.
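For readers without the console, here is a rough Python sketch of the idea behind the S operator: invert the link structure, then rank pages by how much their sets of inbound links overlap. It assumes the score is the number of shared inbound links divided by the larger of the two inbound-link counts. That is an assumption, not the console's confirmed implementation, though it does agree with the Blog table, where H is 32 and 9.375 = 3/32. The function names and the toy link data are made up for illustration.

```python
from collections import defaultdict


def invert_links(links_to):
    """Build page -> set of pages linking to it (the inverse of links_to)."""
    inbound = defaultdict(set)
    for source, targets in links_to.items():
        for target in targets:
            inbound[target].add(source)
    return dict(inbound)


def similarity(a, b):
    """ASSUMED measure: shared inbound links over the larger count, in percent."""
    if not a or not b:
        return 0.0
    return 100.0 * len(a & b) / max(len(a), len(b))


def similar_pages(inbound, page, top=60):
    """Rank every page by similarity of its inbound links to those of `page`."""
    target = inbound.get(page, set())
    scores = [(other, similarity(target, sources))
              for other, sources in inbound.items()]
    scores = [(name, score) for name, score in scores if score > 0]
    # Highest score first; ties broken alphabetically so the order is stable.
    scores.sort(key=lambda item: (-item[1], item[0]))
    return scores[:top]


# Hypothetical toy data: four source pages and the pages they link to.
links_to = {
    "A": ["Love", "Pride", "Envy"],
    "B": ["Love", "Pride"],
    "C": ["Love", "Envy"],
    "D": ["Love", "Matter"],
}

inbound = invert_links(links_to)
for name, score in similar_pages(inbound, "Love"):
    print(f"{name:10} {score:7.3f}")
```

On this toy data, Pride and Envy each share two of Love's four inbound links, and Matter shares one.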
Though maybe I should note that the bigger the number H returns (the count of inverse links for that page, e.g. 704 for Google versus 3 for arXiv.org), the better the result. Which presumably means that if we used even more of Wikipedia, we would get even better results! And that brings to mind the question: how many wikipages do we need to know more than the average human?
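One possible reason a larger H helps, sketched in Python. It assumes the score is the number of shared inbound links divided by the larger of the two pages' inbound-link counts; that is a guess that fits the tables, not something confirmed here. Under that assumption, a page with H inbound links can share at most H of them, so a small H leaves only a few score tiers and many pages tie. The function name and the second page's link count are hypothetical.

```python
def possible_scores(h, other_count):
    """Non-zero scores (percent) a page with h inbound links can get against
    a page with other_count inbound links, under the ASSUMED measure
    shared / max(h, other_count)."""
    largest = max(h, other_count)
    most_shared = min(h, other_count)
    return [round(100.0 * shared / largest, 3)
            for shared in range(1, most_shared + 1)]


# H = 3 (arXiv.org) against a hypothetical page with 5 inbound links:
# only three tiers, which include the 20 and 40 seen in that table.
print(possible_scores(3, 5))

# H = 32 (Blog) against an equally linked page: steps of 3.125.
print(possible_scores(32, 32)[:3])

# H = 704 (Google): 704 possible tiers, so far fewer ties.
print(len(possible_scores(704, 704)))
```

This only shows that the scoring gets finer as H grows. Whether finer scores mean better matches is still the empirical claim made above.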
BTW, I don't think I have linked to this yet: the full Wikipedia link structure in sw notation. It compresses with bzip2 down to about 2 GB, I seem to recall.