Browsing articles tagged with "Study"
October 5, 2008
Fred

Default boot entry on an XP and Ubuntu dual-boot system

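I read about this a long time ago but soon forgot exactly which file it was. This memory of mine...
My setup: XP was installed first, and Ubuntu was later installed from the hard disk, using GRUB; the files from that install are still in the root of the C: drive. After Ubuntu is installed, the machine boots into Ubuntu by default. To change that default, run the following command under Ubuntu:
sudo gedit /boot/grub/menu.lst
Edit the file that opens and save it. Look for a line like: default 0. The number is the position of the entry that boots by default, counting from 0. Note at boot time which position XP occupies in the menu (counting from 0), change the number to match, save, and reboot. Nearby there is also a setting I remembered as something like "delay" (in menu.lst it is named timeout), which sets how long the menu waits before booting the default entry, in seconds.
Note: after an Ubuntu upgrade the positions can shift. My default ended up being memtest, and it turned out that the line XP was on had moved.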
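The edit above can also be scripted. The following is a minimal sketch, not a tested recipe for a live system: it assumes a GRUB legacy menu.lst in which every menu entry starts with title at the beginning of a line, and the function names find_entry_index and set_default_entry are made up for illustration. Try it on a copy of the file first.

```shell
# Sketch only: try this on a COPY of menu.lst, not on /boot/grub/menu.lst itself.

# Print the 0-based position of the first "title" entry containing the text $2.
# $1 = menu.lst-style file, $2 = text to look for (for example "Windows").
find_entry_index() {
    awk -v pat="$2" '
        BEGIN { n = 0; found = 0 }
        /^title/ {
            if (index($0, pat) > 0) { print n; found = 1; exit }
            n++
        }
        END { if (!found) exit 1 }
    ' "$1"
}

# Point the first uncommented "default N" line at entry number $2.
# $1 = menu.lst-style file, $2 = new 0-based index.
set_default_entry() {
    case "$2" in
        ''|*[!0-9]*) echo "index must be a non-negative integer" >&2; return 1 ;;
    esac
    sde_tmp=$(mktemp) || return 1
    awk -v idx="$2" '
        !changed && /^default[ \t]+[0-9]+[ \t]*$/ { print "default\t\t" idx; changed = 1; next }
        { print }
    ' "$1" > "$sde_tmp" && cat "$sde_tmp" > "$1"
    sde_status=$?
    rm -f "$sde_tmp"
    return $sde_status
}
```

For example, after copying menu.lst to ./menu.lst.copy, running set_default_entry ./menu.lst.copy "$(find_entry_index ./menu.lst.copy "Windows")" would make the first entry that mentions Windows the default in the copy. Looking the entry up by name rather than hard-coding the number is what protects against the positions shifting after an upgrade.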

October 1, 2008
Fred

Configuring Lucene and Nutch on Ubuntu

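This post continues the configuration described earlier.
1. Install Lucene
wget http://apache.mirror.phpchina.com/lucene/java/lucene-2.3.2.tar.gz
Do not use lucene-2.3.2-src.tar.gz; it does not contain lucene-demos-2.3.2.jar.
In the download directory:
tar zxvf lucene-2.3.2.tar.gz
mv lucene-2.3.2 /usr/share

build.txt in the unpacked directory lists the basic steps for setting up Lucene. According to it, ant is required, so we install ant next. (Eclipse ships with ant, so if you debug inside Eclipse you probably do not need a separate install; I am not sure about the details.)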
2. Install ant

http://ant.apache.org/bindownload.cgi

ant is a Java-based build automation engine whose scripts are written in XML. Besides Java compilation tasks, ant can invoke many other applications through plugins, and its scripts are somewhat easier to maintain than make scripts.

wget http://apache.mirror.phpchina.com/ant/binaries/apache-ant-1.7.1-bin.tar.gz

In the download directory:
tar zxvf apache-ant-1.7.1-bin.tar.gz

mv apache-ant-1.7.1 /usr/share/

gedit /etc/profile

Add:
ANT_HOME=/usr/share/apache-ant-1.7.1
export ANT_HOME
and edit PATH to read:
PATH=$PATH:$JAVA_HOME/bin:$ANT_HOME/bin

3. Continue setting up Lucene
Edit profile again:
gedit /etc/profile
Add:
LUCENE_HOME=/usr/share/lucene-2.3.2
CLASSPATH=.:${JAVA_HOME}/lib/dt.jar:${JAVA_HOME}/lib/tools.jar:${LUCENE_HOME}/lucene-core-2.3.2.jar:${LUCENE_HOME}/lucene-demos-2.3.2.jar
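A mistyped jar path in CLASSPATH only shows up later as a class-not-found error, so it can help to check the entries right after editing the profile. This is a small sketch under the assumption that entries are separated by colons, as in the line above; the function name missing_classpath_entries is hypothetical.

```shell
# Print each classpath entry that does not exist on disk, one per line.
# $1 = colon-separated classpath string; "." and empty entries are skipped.
missing_classpath_entries() {
    printf '%s\n' "$1" | tr ':' '\n' | while IFS= read -r entry; do
        if [ -n "$entry" ] && [ "$entry" != "." ] && [ ! -e "$entry" ]; then
            printf '%s\n' "$entry"
        fi
    done
}
```

After logging in again so that the profile is re-read, missing_classpath_entries "$CLASSPATH" should print nothing if every jar is where the profile says it is.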

4. Test the Lucene demo
In the Lucene directory:
cd ./src/demo
java org.apache.lucene.demo.IndexFiles /usr/share/lucene-2.3.2/docs

If the paths are correct, the output looks like this:



adding /usr/share/lucene-2.3.2/docs/demo.pdf
adding /usr/share/lucene-2.3.2/docs/demo2.html
adding /usr/share/lucene-2.3.2/docs/gettingstarted.html
adding /usr/share/lucene-2.3.2/docs/fileformats.pdf
adding /usr/share/lucene-2.3.2/docs/scoring.html
adding /usr/share/lucene-2.3.2/docs/linkmap.html
adding /usr/share/lucene-2.3.2/docs/lucene-sandbox/index.html
adding /usr/share/lucene-2.3.2/docs/lucene-sandbox/index.pdf
adding /usr/share/lucene-2.3.2/docs/queryparsersyntax.pdf
adding /usr/share/lucene-2.3.2/docs/linkmap.pdf
adding /usr/share/lucene-2.3.2/docs/demo4.html
adding /usr/share/lucene-2.3.2/docs/benchmarktemplate.xml
adding /usr/share/lucene-2.3.2/docs/index.pdf
Optimizing…
17869 total milliseconds
An index folder is also created.
You can then search. The following command brings up a search prompt:
java org.apache.lucene.demo.SearchFiles
Enter queries as prompted.

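To test the Lucene demo under Tomcat, proceed as follows:
4.1. Copy the /src/jsp directory from the Lucene directory into Tomcat's webapps directory and rename it luceneweb.
4.2. Copy lucene-core-2.3.2.jar and lucene-demos-2.3.2.jar into luceneweb/WEB-INF/lib. (Alternatively, copy luceneweb.war into Tomcat's webapps directory; I did not test this carefully, and it did not seem to work.)

4.3. Copy the index directory into the luceneweb directory.
4.4. Edit configuration.jsp under luceneweb and set String indexLocation = "/usr/share/tomcat6/webapps/luceneweb/index"; String appfooter can be customized as well.
4.5. Restart Tomcat and open http://localhost:8080/luceneweb/ to see the Lucene demo page, then try a search.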

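5. Install Nutch
Download the latest Nutch from the official Apache page, http://www.apache.org/dyn/closer.cgi/lucene/nutch/ . The latest version at the moment is nutch-0.9.
Unpack it into the target folder.
Integrating Nutch with Tomcat and Eclipse is the next piece of work.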

October 1, 2008
Fred

Configuring jdk + eclipse + tomcat on Ubuntu

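A few days ago I set up some tools on Ubuntu, which was quite a hassle. I am writing down the exact steps as a reference for myself, and hopefully as useful information for anyone who needs it. My current development environment is: ubuntu8.04 + jdk1.6.0_06 + Eclipse3.2.2 + Tomcat6.0.18 + Lucene2.3.2 + Nutch0.9. I upgraded Ubuntu yesterday, though, and found that the JDK is now 1.6.0_07; I do not know yet whether that will cause problems.
First, the configuration of the JDK, Eclipse, and Tomcat. It mostly follows these two articles: "[Original] Ubuntu 7.10 J2EE development environment: lomboz + eclipse3.2.1 + tomcat5.5.25 + mysql5" and "ubuntu 8.04 J2EE development environment: eclipse 3.4 Chinese localization + tomcat + mysql + oracle + the three major frameworks".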
//***************************************//
1. jdk
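First install the JDK: sudo apt-get install sun-java6-jre sun-java6-jdk sun-java6-demo sun-java6-doc sun-java6-source sun-java6-plugin sun-java6-fonts libmyodbc tdsodb
This installs the complete Java environment, including the examples and the API documentation. Note that the API documentation has to be downloaded separately and placed in /tmp as prompted (you can get the doc package from the official site). Also, the installer shows a confirmation dialog at one point: press TAB to select OK.
The key steps follow.

Setting the environment variables:
sudo gedit /etc/profile //this configuration file is plain text, so use whichever editor you like: vi, vim, gvim, emacs, or any other text editor on your machine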
JAVA_HOME=/usr/lib/jvm/java-6-sun
CLASSPATH=.:/usr/lib/jvm/java-6-sun/lib
JRE_HOME=/usr/lib/jvm/java-6-sun/jre
export JRE_HOME
export CLASSPATH
export JAVA_HOME
Pay special attention to placement: these lines go near the end of the file, before umask 022.

sudo gedit /etc/environment
JAVA_HOME=/usr/lib/jvm/java-6-sun
CLASSPATH=.:/usr/lib/jvm/java-6-sun/lib
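Tip: to make this take effect immediately, run in a terminal: . /etc/environment
Normally this is unnecessary, because the machine has to be restarted after all these installations anyway. If you only changed the environment variables and want to use them right away, this is the way to do it.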

sudo gedit /etc/jvm
Add at the top of the file:
/usr/lib/jvm/java-6-sun
Tip: this step sets the JDK priority.

sudo gedit ~/.bashrc
Add the following two lines at the end of the file:
export JAVA_HOME=/usr/lib/jvm/java-6-sun
export PATH=$PATH:$JAVA_HOME/bin
Tip: this sets the variables for the current user.

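If several JDKs are installed on your machine, choose the version as follows (8.04 ships without a JDK by default, and the installation above adds only one version, so you can skip this):
sudo update-alternatives --config java

Update the JDK version used by the system:
sudo update-java-alternatives -s java-6-sun

clfour: Ubuntu already had some of this configured right after installation, so the first step was much simpler for me; I only had to download the documentation.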
//***************************************//
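2. Installing Eclipse:
You can choose how to install Eclipse: use the version Ubuntu provides, or download it yourself from the official Eclipse site.
sudo apt-get install eclipse //the version provided by Ubuntu
http://www.eclipse.org for a self-downloaded version (currently 3.4)

sudo apt-get install eclipse
Configure Eclipse:
sudo gedit /etc/eclipse/java_home
Add at the top of the file:
/usr/lib/jvm/java-6-sun
Eclipse can now find the JRE.

clfour: I had already installed Eclipse earlier, the 3.2 version provided by Ubuntu, and did not replace it.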

//***************************************//
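3. Installing Tomcat:
Download tomcat6.0.18 from http://tomcat.apache.org/ and unpack it into /usr/share/.
Download the Tomcat plugin from http://www.eclipsetotale.com/ and unpack it into the plugins directory under the Eclipse directory; the little cat then shows up in Eclipse.

Note: tomcat6.0.18 can go in any directory you like. I gave /usr its own partition, so all my software is installed there. I renamed the package to tomcat6.0; you do not have to, just use your own package name in the environment variable below.

Set the environment variable:
sudo gedit /etc/profile
Enter: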
CATALINA_HOME=/usr/share/tomcat6.0
export CATALINA_HOME

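Reboot.

Running Tomcat
Before starting it, open http://localhost:8080/
to check whether Tomcat is already running.

Change into Tomcat's bin directory, where catalina.sh lives:
sudo ./catalina.sh run

Open a browser and enter http://localhost:8080/ in the address bar.
If you see the cat, everything is OK.

Shut down Tomcat.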

//***************************************//
eclipse+tomcat
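Open Eclipse, go to Window > Preferences > Tomcat, select version Tomcat 6.x, and set Tomcat home to the directory where Tomcat is installed. Click OK and try it out.
Start Tomcat from the button on the Eclipse toolbar.
If you get the following message:
org.apache.catalina.startup.Catalina load
warning: Can't load server.xml from /usr/share/tomcat6/conf/server.xml
check $CATALINA_HOME/conf/server.xml. If root has read and write permission but other users have none, run chmod o+r server.xml to add read permission.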
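The permission check and fix can be sketched as two small shell functions. This is an illustration only: the names others_can_read and ensure_others_can_read are made up, and the check simply reads the mode string printed by ls rather than testing access as a particular user.

```shell
# Succeed if the "others" class has read permission on file $1.
others_can_read() {
    # In the ls -l mode string (e.g. -rw-r--r--), character 8 is the read bit for others.
    [ "$(ls -ld -- "$1" | cut -c8)" = "r" ]
}

# Add read permission for others only if it is missing.
ensure_others_can_read() {
    others_can_read "$1" || chmod o+r "$1"
}
```

Run as a user allowed to change the file (root, in the case described), ensure_others_can_read "$CATALINA_HOME/conf/server.xml" leaves an already readable file untouched and otherwise applies the same chmod o+r as above.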

clfour: The final eclipse+tomcat part still seems to have some problems.

September 19, 2008
Fred

Anchor Text (link anchor text)

Wikipedia: The anchor text or link label is the visible, clickable text in a hyperlink. The words contained in the Anchor text can determine the ranking that page will receive by search engines.

Anchor text is very important. A simple experiment makes this clear.
Search for “click here” on http://www.google.com/ . The first result on the first page is a page from http://www.adobe.com/ , followed by http://www.xe.com/ , http://www.apple.com/ , http://www.microsoft.com/ and others (their PageRank values are all 9 or 10; I will check that later).
None of these pages contain the keyword “click here”, so why do they rank at the top?
The reason: a great many web pages link to these sites using “click here” as the anchor text.
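To make the effect concrete, one can tally how often each anchor text points at each target. The sketch below is only a counting exercise over a made-up input format (one "target URL, TAB, anchor text" pair per line) and says nothing about how any real search engine weighs these counts; the function name anchor_text_counts is hypothetical.

```shell
# Count how often each (target URL, anchor text) pair occurs.
# stdin:  "target_url<TAB>anchor text", one link per line
# stdout: "count<TAB>target_url<TAB>anchor text", highest count first
anchor_text_counts() {
    awk -F '\t' '
        { count[$1 "\t" tolower($2)]++ }
        END { for (key in count) print count[key] "\t" key }
    ' | LC_ALL=C sort -rn
}
```

Fed a list of links harvested from many pages, the top lines show which phrases a site is most often linked with, even when the phrase never appears on the site itself.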

Go take a look now: Google

September 19, 2008
Fred

Lucene: An Introduction to the Java-Based Full-Text Search Engine

Lucene: An Introduction to the Java-Based Full-Text Search Engine
Click to view the original article.

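Lucene is a Java-based full-text indexing toolkit.
Introduction to the Java-based full-text indexing engine Lucene: about the author and Lucene's history
Implementing full-text search: comparing Lucene's full-text index with database indexes
Introduction to Chinese word segmentation: comparing dictionary-based and automatic segmentation algorithms
Installation and usage: system structure overview and demo
Hacking Lucene: a simplified query parser, implementing deletion, custom sorting, extending the application interface
What else we can learn from Lucene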

References:
Apache: Lucene Project
http://jakarta.apache.org/lucene/
Lucene developer/user mailing list archives
http://www.mail-archive.com/lucene-dev@jakarta.apache.org/
http://www.mail-archive.com/lucene-user@jakarta.apache.org/
The Lucene search engine: Powerful, flexible, and free
http://www.javaworld.com/javaworld/jw-09-2000/jw-0915-Lucene_p.html
Lucene Tutorial
http://www.darksleep.com/puff/lucene/lucene.html
Notes on distributed searching with Lucene
http://home.clara.net/markharwood/lucene/
Chinese word segmentation
http://www.google.com/search?sourceid=navclient&hl=zh-CN&q=chinese+word+segment
Introduction to search engine tools
http://searchtools.com/
Papers and patents by Lucene author Cutting
http://lucene.sourceforge.net/publications.html
A .NET implementation of Lucene: dotLucene
http://sourceforge.net/projects/dotlucene/
Another project by Lucene author Cutting: the Java-based search engine Nutch
http://www.nutch.org/
http://sourceforge.net/projects/nutch/
A comparison of dictionary-based and N-gram word segmentation
http://china.nikkeibp.co.jp/cgi-bin/china/news/int/int200302100112.html
2005-01-08: Cutting's lecture on Lucene at the University of Pisa, a very detailed explanation of Lucene's architecture
Special thanks to Jack Xu (许良杰), former CTO of NetEase, for his guidance: you brought me into the search engine industry.

September 19, 2008
Fred

Introduction to Nutch [reposted from the Nutch Chinese website]

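Nutch is an open-source search engine implemented in Java. It provides all the tools we need to run our own search engine. But why would we want to build our own search engine? After all, we already have Google. Here are three reasons:

Transparency: Nutch is open source, so anyone can see how its ranking algorithm works. Commercial search engines keep their ranking algorithms secret, so we cannot know why results are ranked the way they are. Moreover, some search engines, Baidu for example, allow paid ranking, so their results are not purely a reflection of site content. Nutch is therefore a good choice for academic search and for government sites, where a fair ranking matters a great deal.
Understanding search engines: we do not have Google's source code, so Nutch is a good way to learn how a search engine works. Understanding how a large distributed search engine operates is very rewarding. Nutch draws on both academia and industry. For example, the core of Nutch has been reimplemented using MapReduce. Anyone who has seen Kai-Fu Lee's talks will know a little about MapReduce: it is a distributed processing model that first came out of Google's labs. You can find more information at the links below.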

http://www.domolo.com/bbs/list.asp?boardid=29

http://domolo.oicp.net/bbs/list.asp?boardid=29

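Nutch has also attracted many researchers who are eager to try out new search algorithms, because Nutch is very easy to extend.
Extensibility: do you dislike the way other search engines present their results? Then write your own search engine with Nutch. Nutch is very flexible: it can be customized and integrated into your application. With its plugin mechanism, Nutch can serve as a search platform for different kinds of content. The simplest use, of course, is to integrate Nutch into your site to provide search for your users.
Nutch installations come in three scales: local file system, intranet, or the whole internet. Each has its own characteristics. Indexing a local file system, for example, is certainly more reliable than the other two, because there are no network errors and no need to cache copies of files. Internet-wide search is the other extreme: crawling thousands of pages raises many technical questions. Which pages do we start from? How do we distribute the crawling work? When do we need to re-crawl? How do we handle broken links, unresponsive sites, and duplicate content? And how do we serve hundreds of concurrent queries against a large data set? Building such a search engine is no small investment. In "Building Nutch: Open Source Search", authors Mike Cafarella and Doug Cutting sum it up as follows:

… a fully functional search system: a 100-million-page index handling 2 concurrent searches per second costs about $800 per month; a 1-billion-page index handling 50 page requests per second costs roughly $30,000 per month.

This article shows how to set up Nutch on a medium-sized site. The first part focuses on crawling: Nutch's crawling architecture, how to run a crawl, and what the crawl produces. The second part covers searching: how to run the Nutch search application and how to customize Nutch.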

Nutch Vs. Lucene

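Nutch is built on Lucene, which provides Nutch with its text indexing and search API. A common question is: should I use Lucene or Nutch? The simplest answer: if you do not need to crawl data, use Lucene. A typical case is that you already have a data source and need to provide a search page for it. The best approach then is to pull the data directly from the database and index it with the Lucene API. Chinese-language users can refer to WebLucene or Che Dong's (车东) series of articles. If you need help with Chinese word segmentation you can also contact the author: http://domolo.oicp.net/bbs/list.asp?boardid=24 . Lucene in Action by Erik Hatcher and Otis Gospodnetić describes this process in detail. Nutch is suited to sites whose underlying database you cannot access directly, or to data sources that are scattered.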

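Architecture

Nutch divides broadly into two parts: the crawler and the searcher. The crawler fetches pages and turns the fetched data into an inverted index; the searcher answers user queries by searching that inverted index. The index is the interface between the two, and both use the fields in the index.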

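In fact, the searcher and the crawler can run on separate machines.

Let us look at the crawler first.

The crawler:

The crawler is driven by Nutch's crawl tool. This is a set of tools used to build and maintain several different data structures: the web database, a set of segments, and the index. Each of the three is explained below.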

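The web database, or WebDB, is a specialized persistent data structure that mirrors the structure and properties of the web graph being crawled. It stores all the structural data and properties collected since the crawl began, including re-crawls. WebDB is used only by the crawler; the searcher does not use it. WebDB stores two kinds of entities: pages and links. A page represents a web page on the network; it is indexed by its URL, and an MD5 hash of its content is kept as a signature. Other information about the page is stored as well: the number of links in the page (outlinks), fetch information (used when the page is re-fetched), and a score representing the page's importance. A link represents a link from one web page to another. WebDB can therefore be seen as a web graph in which the pages are nodes and the links are edges.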
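Two of the per-page properties mentioned here, the outlink count and the MD5 content signature, are easy to picture with a few lines of shell. This is a loose sketch, not WebDB's actual storage format: the edge-list input ("from_url to_url" per line), the use of md5sum from GNU coreutils, and the function names count_outlinks and content_signature are all assumptions for illustration.

```shell
# Count outgoing links per page from "from_url to_url" lines on stdin.
# Output: "url<TAB>outlinks", sorted by url. Each input line is one edge of the web graph.
count_outlinks() {
    awk '{ out[$1]++ } END { for (u in out) print u "\t" out[u] }' | LC_ALL=C sort
}

# MD5 signature of a page body read from stdin (relies on md5sum from GNU coreutils).
content_signature() {
    md5sum | cut -d ' ' -f 1
}
```

Two fetches of a page that return the same bytes give the same signature, so comparing signatures is a cheap way to notice unchanged or duplicated content.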

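A segment is a collection of pages that are fetched and indexed as a unit. A segment's fetchlist is the list of URLs for the fetcher to fetch, and it is generated from WebDB. The fetcher's output is the pages fetched from the fetchlist; this output is first inverted-indexed, and the resulting index is stored in the segment. A segment has a limited lifespan: once the next round of crawling begins it is no longer needed. The default re-fetch interval is 30 days, so segments older than that can be deleted, which also saves a good deal of disk space. Segments are named by date and time, so it is easy to see how old they are.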
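Because the name encodes the creation time, deciding whether a segment has outlived the re-fetch interval needs nothing but the name. The sketch below assumes names of the form YYYYMMDDHHMMSS interpreted as UTC, and GNU date for parsing; both are assumptions, as is the function name segment_expired. It only answers yes or no and deletes nothing.

```shell
# Succeed if the segment named $1 (assumed format YYYYMMDDHHMMSS, read as UTC)
# is more than $3 days old at time $2.
# $2 = current time in epoch seconds, $3 = maximum age in days (default 30).
# Returns 2 if the name does not match the assumed format.
segment_expired() {
    seg_stamp=$(printf '%s\n' "$1" | sed -n \
        's/^\([0-9]\{4\}\)\([0-9]\{2\}\)\([0-9]\{2\}\)\([0-9]\{2\}\)\([0-9]\{2\}\)\([0-9]\{2\}\)$/\1-\2-\3 \4:\5:\6/p')
    [ -n "$seg_stamp" ] || return 2
    seg_created=$(date -u -d "$seg_stamp" +%s) || return 2   # -d is GNU date syntax
    [ $(( $2 - seg_created )) -gt $(( ${3:-30} * 86400 )) ]
}
```

A cleanup script could loop over the segment directory names and call segment_expired "$name" "$(date +%s)" for each, listing the candidates for review before anything is removed.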

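The index is the inverted index of all pages the system has fetched. It is not produced directly by inverting the pages; it is produced by merging the indexes of many small segments. Nutch uses Lucene for indexing, so the Lucene tools and APIs all apply to the index. Note that Lucene's notion of a segment is completely different from Nutch's; do not confuse the two. See Che Dong's (车东) articles at www.chedong.com. In short, a Lucene segment is a portion of a Lucene index, whereas a Nutch segment is a portion of the WebDB that has been fetched and indexed.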

September 18, 2008
Fred

The Best Tools for Visualization[ZZ]

This post introduces the visualization tools below. A Chinese translation (by 帕兰映像) is at: http://parandroid.com/magic-number-of-100-visualization-technology-application/
Written by Sarah Perez / March 13, 2008 9:25 AM

Visualization is a technique to graphically represent sets of data. When data is large or abstract, visualization can help make the data easier to read or understand. There are visualization tools for search, music, networks, online communities, and almost anything else you can think of. Whether you want a desktop application or a web-based tool, there are many specific tools available on the web that let you visualize all kinds of data. Here are some of the best:

Visualize Social Networks

Last.Forward: Thanks to Last.fm’s new widget gallery, you can now explore a wide selection of extras to extend your Last.fm experience. The gallery hosts widgets for your desktop, for the web, for social networks, and much more. One of the better tools in the gallery, last.forward, is open source software that lets you map out any last.fm user and their connections. The web site for the software appears to be in German, but the “Download” button still works. And once it was downloaded and installed, I had no trouble using it myself.

Friends Sociomap: Friends Sociomap is another Last.fm tool that generates a map of the music compatibility between you and your Last.fm friends.

Fidg’t: Fidg’t is a desktop application that gives you a way to view your network’s tagging habits. You can see what kind of music your network is into, or what kind of pictures they are taking. The Fidg’t Visualizer allows you to play around with your network. To use Fidg’t, you interface with the Visualizer through Flickr and LastFM tags, using any tag to create what they call a “Magnet.” Once a Tag Magnet is created, members of the network will gravitate towards it if they have photos or music with that same Tag. You can also search through the network for certain users, and see their recent photos and music. The Fidg’t interface is beautiful, too.

The Digg Tools:

Digg.com has some of the best web-based visualization tools on the net, so they’re a must for any visualization list.

  • Pics: Digg Pics is the latest tool that tracks the activity of images on the site with images that slide in from the left as people submit them and digg them.
  • Arc: Digg Arc displays stories, topics, and containers wrapped around a sphere. The more diggs, the thicker the arcs.
  • BigSpy: Digg BigSpy places stories at the top of the screen as they are dugg. Bigger stories have more diggs.
  • Stack: Digg Stack shows diggs in real time, with diggs falling from the top of the screen. As stories get more diggs, they’re shown in brighter colors.
  • Swarm: Digg Swarm draws circles for stories as they’re dugg. Diggers swarm around stories which makes them grow and get brighter.

One more: Digg Radar. Although this is an unofficial visual aid, Digg Radar is worth a look too. With Digg Radar, you wait and watch for buttons to appear on the map which indicate that a person has Dugg a story. Hover over the button to see their username. Click it to see details about the story, with links to the Digg page or directly to the article.

YouTube:

You can discover related videos using YouTube’s visualizations. To use this feature, go to a YouTube video, click on the full-screen button, and then click on the small button that shows a network. You’ll see a lot of video balloons appear and the configuration will change when you hover over a button.

Visualize Music

  • Liveplasma and Musicovery let you discover new music.
  • Tuneglue music map is a “relationship explorer,” similar to LivePlasma. Using data from Amazon and Last.fm, Tuneglue explores relationships between musical artists.
  • Moody lets you tag your music collection with colors. They also have a color-coded web player. (our coverage)
  • The Echo Nest is an audio analysis tool which takes an mp3 file, breaks it up into little segments, and gives pitch, loudness, and high-level timbral descriptions of each one of those segments. The program maps a subset of this audio data onto a visual scale and creates video playback of the song. (more)
  • An interactive harmony model of music which geometrically describes relationships in harmony. The model can be a visualization tool for songwriters or students of music.
  • Musiclens gives music recommendations and presents your current mood and musical taste as a diagram.
  • Shape Of Song: What does music look like?
  • Musicmap: connections are represented as connected lines; they create a web.

Last.fm music visual tools:

  • Last Graph: Create artist wave graphs from your musical history in PDF and SVG format.
  • Extra Stats: Colorful Stats and tag clouds.

Visualize the Internet

  • Opte is a project that lets you graphically map the internet. The data represented and collected here serves a multitude of purposes: Modeling the Internet, analyzing wasted IP space, IP space distribution, detecting the result of natural disasters, weather, war, and esthetics/art.
  • Akamai Technologies, who deliver 15-20% of all web traffic, offered up some interesting tools last year for viewing their traffic data. (Our coverage) From their flagship app, the Real-time Web Monitor, which shows countries with the most traffic, to the Network Performance Comparison app, Akamai’s tools are an interesting way to see the web in real time. In all, they offer 6 Flash-based apps to the public.
  • Other internet traffic visualizations include the Internet Health Report and the Internet Traffic Report.
  • MantaRay displays the geographical placement of MBONE infrastructure (Multi-cast backbone) of the internet. Otter displays topological views of the (same) multicast infrastructure.
  • Packet Garden is an app that watches your Internet traffic and builds a private world that you can later explore.
  • Mapnet is a Java applet to visualize the topologies of backbones of major U.S. Internet Service Providers.
  • Websites as graphs. An HTML DOM Visualizer Applet, which displays sites as graphs depending on the amount of links, tables, div tags, images, forms and other tags.

Amazon

  • LivePlasma: music discovery (see also music section of this list)
  • Flowser is another flash-based Amazon visualization for search.
  • BrowseGoods is a visualization that lets you zoom and pan Amazon’s catalog of products.
  • Tuneglue music map is a “relationship explorer,” similar to LivePlasma. Using data from Amazon and Last.fm, Tuneglue explores relationships between musical artists. (see also music section of this list)
  • Coverpop is more of an art project that lets you browse Amazon via a collage.
  • Amaztype, a typographic book search, collects the information from Amazon and presents it in the form of keyword you’ve provided. To get more information about a given book, simply click on it.

Flickr

  • Taglines lets you visualize Flickr tags over time
  • Flickrvision: view real-time flickr photos on a map.
  • Flickrtime is a tool that uses Flickr API to present the uploaded images in real-time. The images form the clock which shows the current time.

Some details on these: see “Alternative ways to browse Amazon” (our coverage)

Miscellaneous

  • Visual Thesaurus: The Visual Thesaurus is an interactive dictionary and thesaurus which creates word maps that blossom with meanings and branch to related words.
  • Twittervision: view real-time tweets on a map.
  • 17 More Ways to Visualize Twitter
  • All the ways to visualize del.icio.us collected here.
  • Three Views shows three views of the earth, in which each country is represented by a circle that shows the amount of money spent on the military (size of circle) and what fraction of the country’s earnings that uses (color).
  • We Feel Fine shows human feelings calculated from a large number of weblogs.
  • Interactive History Timeline presents the history of Great Britain, divided into interactive data blocks.
  • Winning Lotto Numbers shows the frequency of appearance of every number from one year to the next one.
  • Language Poster – the history of programming languages

Sites Dedicated to Visualization

Many Eyes

Search

Heatmaps:

Heatmaps site CrazyEgg applies heatmaps to tracking what visitors do on a user’s website. Their software captures user clicks on each page and then presents a summary in the form of a heatmap. Other heatmap sites include Feng-GUI and FuseStats. Summize applies heatmaps to shopping via their search engine(our coverage here, here and here).

Visualizing the Power Struggle in Wikipedia displays the most popular articles and the most frequent search queries in the heatmap.

Visual Search Engines:

  • Riya’s Like.com: first true visual search engine does visual search for shopping.
  • Searchme: upcoming visual search for the web
  • Xcavator: A photo search engine which utilizes visual clues that you provide to identify and extract similar pictures from large groups of digital images.
  • ManagedQ: A visual search experiment with some built-in semantics. (our coverage)
  • oSkope: Visual search engine for finding products that searches Amazon, Ebay, Flickr, Fotolia, Yahoo!Image Search and YouTube.
  • Quintura: visual search engine that uses clouds, tags, and highlighting.
  • Tafiti: Microsoft’s experimental visual search engine running on Silverlight.
  • Retrievr is an experimental service which lets you search and explore in a selection of Flickr images by drawing a rough sketch.
  • Mooter: Visual search engine that organizes results in clusters.
  • KartOO: visual web search.
  • SearchCrystal is a search visualization tool that lets you compare, remix and share results from sources on the web, whether sites, images, videos, blogs, news engines or RSS feeds. (see also KoolTorch)
  • Spacetime: search Google, YouTube, RSS, eBay, Amazon, Yahoo!, Flickr and images all in one 3D space.
  • grokker: web search or enterprise search offering map views of data.
  • Burst Labs suggests similar or connected items to your search queries in a bubble
  • UBrowser renders interactive web pages onto geometry using OpenGL and an embedded instance of Gecko
  • walk2web – enter a URL, then visually browse web sites linked from it
  • TouchGraph’s Amazon Browser, Google Browser, and LiveJournal Browser

News and RSS

  • Voyage is an RSS reader which displays the latest news in the “gravity area”. News can be zoomed in and out. Navigation is possible with a timeline.
  • Newsmap is an application that visually reflects the constantly changing landscape of the Google News news aggregator.
  • Universe DayLife displays events, connections and news as circles which gravitate around the topic they are related to.

Data

Swivel

some sources – via, via
