Anchor Text (Link Text)

Wikipedia: The anchor text, or link label, is the visible, clickable text in a hyperlink. The words contained in the anchor text can influence the ranking that the linked page receives from search engines.

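Anchor text matters a great deal, and a simple experiment shows why.
Search for "click here" on http://www.google.com/ . The first result on the first page is a page from http://www.adobe.com/ , followed by http://www.xe.com/ , http://www.apple.com/ , http://www.microsoft.com/ and others (all of these have a PageRank of 9 or 10; worth checking afterwards).
None of these pages contains the keyword "click here", so why do they rank at the top?
The reason: a very large number of web pages link to these sites using "click here" as the anchor text.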
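
To make the mechanism concrete, here is a minimal Python sketch of ranking by anchor text alone. It is an illustration, not how Google actually ranks pages: the function names and the link data are hypothetical, and real engines combine anchor text with many other signals.

```python
from collections import defaultdict

def build_anchor_index(links):
    """Map each anchor phrase to the pages it points at, with link counts.

    `links` is an iterable of (source_url, anchor_text, target_url) tuples.
    """
    index = defaultdict(lambda: defaultdict(int))
    for _source, anchor, target in links:
        index[anchor.strip().lower()][target] += 1
    return index

def rank_by_anchor(index, query):
    """Return target URLs ordered by how many links use `query` as anchor text."""
    counts = index.get(query.strip().lower(), {})
    return sorted(counts, key=lambda url: (-counts[url], url))

# Hypothetical link data: the targets never need to contain "click here".
links = [
    ("http://a.example/", "click here", "http://www.adobe.com/"),
    ("http://b.example/", "Click here", "http://www.adobe.com/"),
    ("http://c.example/", "click here", "http://www.apple.com/"),
    ("http://d.example/", "download", "http://www.apple.com/"),
]
index = build_anchor_index(links)
ranking = rank_by_anchor(index, "click here")
```

The target pages are ranked purely by what other pages say about them, which is why a page can rank for a phrase it never contains.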

Go take a look: Google

Lucene: An Introduction to a Java-Based Full-Text Search Engine

Click through to read the original article.

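Lucene is a Java-based full-text indexing toolkit.
An introduction to Lucene, the Java-based full-text indexing engine: about the author and the history of Lucene
Implementing full-text search: Lucene full-text indexes compared with database indexes
An introduction to Chinese word segmentation: dictionary-based versus automatic segmentation algorithms
Installation and usage: system architecture overview and demo
Hacking Lucene: a simplified query parser, implementing deletion, custom sorting, and extending the application interface
What else we can learn from Lucene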

References:
Apache: Lucene Project
http://jakarta.apache.org/lucene/
Lucene developer and user mailing list archives
http://www.mail-archive.com/lucene-dev@jakarta.apache.org/
http://www.mail-archive.com/lucene-user@jakarta.apache.org/
The Lucene search engine: Powerful, flexible, and free
http://www.javaworld.com/javaworld/jw-09-2000/jw-0915-Lucene_p.html
Lucene Tutorial
http://www.darksleep.com/puff/lucene/lucene.html
Notes on distributed searching with Lucene
http://home.clara.net/markharwood/lucene/
Chinese word segmentation
http://www.google.com/search?sourceid=navclient&hl=zh-CN&q=chinese+word+segment
An overview of search engine tools
http://searchtools.com/
Papers and patents by Cutting, the author of Lucene
http://lucene.sourceforge.net/publications.html
A .NET implementation of Lucene: dotLucene
http://sourceforge.net/projects/dotlucene/
Another project by Cutting, the author of Lucene: Nutch, a Java-based search engine
http://www.nutch.org/
http://sourceforge.net/projects/nutch/
A comparison of dictionary-based and N-gram word segmentation
http://china.nikkeibp.co.jp/cgi-bin/china/news/int/int200302100112.html
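2005-01-08: Cutting's lecture on Lucene at the University of Pisa, a very detailed walkthrough of the Lucene architecture
Special thanks to Jack Xu (许良杰), former CTO of NetEase, for his guidance: you brought me into the search engine field.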

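Introduction to Nutch [reposted from the Nutch Chinese site]

Nutch is an open-source search engine implemented in Java. It provides all the tools needed to run your own search engine. But why build your own search engine when Google is already there? Here are three reasons:

Transparency: Nutch is open source, so anyone can see how its ranking algorithm works. Commercial search engines keep their ranking algorithms secret, so we cannot know why results are ordered the way they are. Moreover, some search engines, Baidu for example, allow paid ranking, so their results do not reflect how relevant a site's content is. Nutch is therefore a good choice for academic search and for government sites, where fair ranking matters a great deal.
Understanding search engines: We do not have Google's source code, so studying Nutch is a good alternative. Understanding how a large distributed search engine works is rewarding. Nutch borrows from both academia and industry: for example, the core of Nutch has been reimplemented with MapReduce. Anyone who has watched Kai-Fu Lee's talks will know a little about MapReduce, a distributed processing model first proposed at Google. More information is available at the links below.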

http://www.domolo.com/bbs/list.asp?boardid=29

http://domolo.oicp.net/bbs/list.asp?boardid=29

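Nutch has also attracted many researchers who are eager to try new search algorithms, because Nutch is very easy to extend.
Extensibility: Don't like how other search engines present results? Then write your own search engine with Nutch. Nutch is very flexible: it can be customized and integrated into your application, and with its plugin mechanism it can serve as a search platform across different kinds of content. The simplest use, of course, is to integrate Nutch into your site and offer search to your users.
Nutch can be deployed at three scales: the local file system, an intranet, or the whole internet, each with its own characteristics. Indexing a local file system is certainly more stable than the other two, because there are no network errors and no need to cache copies of files. Internet-scale search is the other extreme: crawling thousands of pages raises many technical problems. Which pages do we start from? How do we distribute the crawling work? When do we re-crawl? How do we handle broken links, unresponsive sites, and duplicate content? And how do we serve hundreds of concurrent requests against a very large data set? Building such a search engine is a serious investment. In "Building Nutch: Open Source Search," authors Mike Cafarella and Doug Cutting summarize it as follows: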

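… a fully functional search system: a 100-million-page index handling 2 searches per second costs about $800 per month; a 1-billion-page index handling 50 page requests per second costs roughly $30,000 per month.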

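This article shows how to set up Nutch on a medium-sized site. The first part focuses on crawling: Nutch's crawling architecture, how to run a crawl, and what the crawl produces. The second part covers searching: how to run the Nutch search application and how to customize Nutch.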

Nutch Vs. Lucene

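Nutch is built on Lucene, which provides Nutch with its text indexing and search API. A common question is: should I use Lucene or Nutch? The simplest answer: if you do not need to crawl data, use Lucene. The typical case is that you already have a data source and need a search page for it. The best approach then is to read the data straight from the database and index it with the Lucene API. Chinese-language users can refer to WebLucene or the series of articles by Che Dong (车东). For help with Chinese word segmentation you can also contact the author: http://domolo.oicp.net/bbs/list.asp?boardid=24 . Lucene in Action, by Erik Hatcher and Otis Gospodnetić, covers this process in detail. Nutch is the better fit when you have no direct access to a site's underlying database, or when the data sources are scattered.

Architecture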

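Nutch divides into two parts: the crawler and the searcher. The crawler fetches pages and builds an inverted index from the fetched data; the searcher answers user queries by searching that inverted index. The interface between the two is the index, and both use the fields stored in it.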
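
The split between building an inverted index and querying it can be sketched in a few lines of Python. This is a toy illustration under simple assumptions (whitespace tokenization, AND queries, made-up URLs), not Nutch's or Lucene's actual implementation.

```python
from collections import defaultdict

def build_inverted_index(pages):
    """Crawler side: build term -> set of URLs from a {url: text} mapping."""
    index = defaultdict(set)
    for url, text in pages.items():
        for term in text.lower().split():
            index[term].add(url)
    return index

def search(index, query):
    """Searcher side: return URLs containing every term of the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

# Hypothetical fetched pages.
pages = {
    "http://example.org/a": "open source search engine",
    "http://example.org/b": "open source crawler",
}
index = build_inverted_index(pages)
hits = search(index, "open search")
```

The searcher only ever touches `index`, which is what makes it possible to run the two halves on separate machines.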

In practice, the searcher and the crawler can run on different machines.

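Let's look at the crawler first.

The crawler:

The crawler is driven by Nutch's crawl tools, a set of tools that build and maintain several data structures: the web database, a set of segments, and the index. Each of the three is explained below.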

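The web database, or WebDB, is a specialized persistent data structure that mirrors the structure and properties of the web being crawled. It stores the structure and properties of all sites seen since the crawl began, including re-crawls. WebDB is used only by the crawler; the searcher does not use it. WebDB stores two kinds of entities: pages and links. A page represents a page on the web; it is indexed by its URL, and an MD5 hash of its content is computed as well. Other information about the page is also stored: the number of links in the page (outlinks), fetch information (for when the page is re-fetched), and a score representing the page's importance. A link represents a link from one page to another. WebDB can therefore be seen as a web graph in which pages are nodes and links are edges.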
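
A rough Python sketch of such a page/link graph follows. The class and field names (`Page`, `WebGraph`, `content_md5`, `outlinks`, `score`) are hypothetical and chosen to mirror the description above; they are not Nutch's real WebDB classes or on-disk format.

```python
import hashlib
from dataclasses import dataclass, field

@dataclass
class Page:
    url: str
    content_md5: str      # MD5 hex digest of the page content
    outlinks: int = 0     # number of links leaving this page
    score: float = 1.0    # page-level importance score

@dataclass
class WebGraph:
    pages: dict = field(default_factory=dict)  # url -> Page (nodes)
    links: set = field(default_factory=set)    # (from_url, to_url) pairs (edges)

    def add_page(self, url, content, score=1.0):
        digest = hashlib.md5(content.encode("utf-8")).hexdigest()
        self.pages[url] = Page(url, digest, score=score)

    def add_link(self, from_url, to_url):
        # The source page must already be known; repeated edges are ignored.
        if (from_url, to_url) not in self.links:
            self.links.add((from_url, to_url))
            self.pages[from_url].outlinks += 1

    def duplicates_of(self, url):
        """Other URLs whose content hash matches, i.e. likely duplicate pages."""
        digest = self.pages[url].content_md5
        return sorted(u for u, p in self.pages.items()
                      if p.content_md5 == digest and u != url)

graph = WebGraph()
graph.add_page("http://example.org/", "home page")
graph.add_page("http://example.org/copy", "home page")
graph.add_page("http://example.org/about", "about us")
graph.add_link("http://example.org/", "http://example.org/about")
graph.add_link("http://example.org/", "http://example.org/about")  # ignored
```

Storing a content hash alongside the URL is one plausible way to spot the same page served under several addresses.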

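A segment is a collection of pages, and it is indexed. A segment's fetchlist is the list of URLs the fetcher works through; it is generated from WebDB. The fetcher's output is the pages fetched from the fetchlist. That output is first inverted-indexed, and the resulting index is stored in the segment. A segment has a limited lifespan: it becomes obsolete once the next crawl round begins. The default re-fetch interval is 30 days, so segments older than that can be deleted, which also saves a good deal of disk space. Segments are named by date and time, so their age is easy to see at a glance.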
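
Because segment names encode their creation time, finding segments past the re-fetch interval is a small date calculation. The sketch below assumes a `yyyyMMddHHmmss` naming pattern and uses made-up segment names; check the names your own crawl produces before relying on the format.

```python
from datetime import datetime, timedelta

SEGMENT_NAME_FORMAT = "%Y%m%d%H%M%S"   # assumed naming pattern
REFETCH_INTERVAL = timedelta(days=30)  # default interval stated above

def expired_segments(names, now, max_age=REFETCH_INTERVAL):
    """Return the segment names whose timestamp is older than max_age."""
    expired = []
    for name in names:
        created = datetime.strptime(name, SEGMENT_NAME_FORMAT)
        if now - created > max_age:
            expired.append(name)
    return sorted(expired)

# Hypothetical segment directory names.
names = ["20080801090000", "20080910120000"]
old = expired_segments(names, now=datetime(2008, 9, 16))
```

Passing `now` explicitly keeps the function deterministic, which makes it easy to dry-run before deleting anything.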

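The index is the inverted index of all pages the system has fetched. It is not produced by indexing pages directly; it is built by merging the indexes of many small segments. Nutch uses Lucene for indexing, so all Lucene tools and APIs can be used to build the index. Note that Lucene's notion of a segment is completely different from Nutch's, so do not confuse the two; see Che Dong's articles at www.chedong.com. In short, a Lucene segment is a part of a Lucene index, whereas a Nutch segment is a portion of WebDB that has been fetched and indexed.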
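
The merge step can be pictured as a union of per-segment inverted indexes. This Python sketch models each segment index as a plain dictionary; it only illustrates the idea and does not use Lucene's index-merging API.

```python
from collections import defaultdict

def merge_indexes(segment_indexes):
    """Merge several {term: set_of_urls} indexes into one."""
    merged = defaultdict(set)
    for segment_index in segment_indexes:
        for term, urls in segment_index.items():
            merged[term] |= set(urls)
    return dict(merged)

# Hypothetical per-segment indexes.
segment_a = {"open": {"http://example.org/a"}, "search": {"http://example.org/a"}}
segment_b = {"open": {"http://example.org/b"}, "crawler": {"http://example.org/b"}}
merged = merge_indexes([segment_a, segment_b])
```

The inputs are left unmodified, so an old segment can still be deleted or re-merged independently.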

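Google Android Phone: HTC Dream Launch Date Moved Up

Source: InformationWeek

Translation:

T-Mobile will begin selling the HTC Dream, a smartphone running Google's open source Android mobile platform. According to the Wall Street Journal, the phone is poised to take on Apple's iPhone and RIM's BlackBerry.

Earlier reports said the HTC Dream would be available for pre-order on September 17, but the Wall Street Journal reports that T-Mobile will announce it on September 23. Stefan Frank of the blog Android Authority posted his invitation to a press conference in New York at 10:30 a.m. on September 23.

The HTC Dream was approved by the FCC (Federal Communications Commission) in August.

Details of the phone have not been confirmed, but sources such as the Wall Street Journal and the unofficial T-Mobile blog Tmonews indicate that the HTC Dream will support 3G and have a 5-inch by 3-inch touchscreen, a trackball, and a sliding or swivel QWERTY keyboard.

The HTC Dream will be the first smartphone on the market to ship with Google Android; more interesting still, a variety of third-party applications have already been built for the platform.

Unlike Apple, Google has strongly supported developers who want to write software for Android phones: it launched the Android Developer Challenge, with $10 million in prizes for the best Android applications.

First-round winners include cab4me, which lets users call a taxi with a single tap on their phone, and Ecorio, which automatically tracks a user's mobile carbon footprint.

Original: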

The Google Channel

September 16, 2008

Dream On: T-Mobile To Unveil First Google Android Phone Soon

T-Mobile will be selling the HTC Dream, a smartphone that will run the Google Android open source mobile platform and is poised to take on the Apple iPhone and Research in Motion’s BlackBerry, in late October, according to an article by the Wall Street Journal.

Previous reports indicated the HTC Dream could be available for pre-order as early as Sept. 17, but the Wall Street Journal says that T-Mobile will announce the HTC Dream on Sept. 23. Stefan Frank of the blog Android Authority posted a screenshot of his invitation to a press conference on Sept. 23 at 10:30 a.m. in New York City.

The HTC Dream was approved by the FCC back in August.

Details have not been confirmed, but sources such as the Wall Street Journal and the unofficial T-Mobile blog Tmonews indicate that it’s likely the HTC Dream will run on the 3G network, and have a 5-inch by 3-inch touchscreen, a trackball and a sliding or swivel QWERTY keyboard.

The HTC Dream will be the first smartphone on the market equipped with Google Android software; interestingly, a variety of third party apps have already been created on the Google Android platform.

Unlike Apple, Google is throwing a lot of support behind developers who want to create apps for Google Android phones — Google is sponsoring the Android Developer challenge with a total of $10 million in prizes for the best Google Android apps. Winning apps from the first round of the contest include cab4me, which will allow users to order a cab with a single click on their phones, and Ecorio, which automatically tracks a user’s mobile carbon footprint.

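Free Weather Forecast SMS from Google

I only just came across this online, so here it is to share.

Go to: http://www.google.com/sms/alerts

Then pick the city you want forecasts for and click "免费订阅" (subscribe for free); a dialog box pops up:

You can take it from there.

The forecast arrives as a Chinese-language SMS and only covers the following day, but that is good enough. It comes once a day, and the message is sent in the afternoon.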

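[ZZ] Major Google Picasa 3 Update Adds Face Recognition

Google's 10th anniversary really has brought some pleasant surprises: first the Chrome browser, and now Picasa 3, which changes a lot compared with the previous version. For now it seems to be English-only; the Chinese version is still at 2.75.

Picasa 3 download

Below is the review from Ars Technica.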

By David Chartier | Published: September 03, 2008 – 07:05PM CT


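Translation:

Google yesterday announced major updates to both parts of its Picasa ecosystem: the desktop client and Web Albums, its online service for publishing and sharing photos. Ars Technica took some time to dust off the old family photos and try out the new editing, publishing, and identification features.

Google's Picasa update comes in two parts, a desktop client and a web service, which are now much more tightly integrated, so we will try to cover them separately while pointing out the new ways in which they connect. Because Picasa depends in many ways on its desktop client, we start there.

Picasa 3 for Windows PC (please make a Mac OS X client)

Picasa 3 sports a carefully refined interface, and its new features and publishing integration are hard to cover in a thousand words (though that is the target for this article). Unfortunately, Picasa 3 is currently Windows-only; the Linux version is still at 2.7, and the Mac version is rumored to arrive by the end of 2008. Picasa's interface is tighter, with smaller buttons in some areas such as the top toolbar. It even adopts new interface controls such as the slider button from the iPhone OS, used to toggle features like syncing an album or folder to Picasa Web Albums (more on that shortly). We have not used Picasa much in the past, but the changes are welcome and make the product feel more mature.

Perhaps the most significant new feature in Picasa 3 is the new "Sync to Web" switch, available on any folder or album. It means that any new photos, and even changes to existing photos, are automatically synced to Picasa Web Albums (a Google Account is required), Google's competitor to Flickr. For organizing and sharing photos, this feature alone makes Picasa and its web counterpart much more appealing, because publishing photos online is usually tedious and complicated. It also removes the need to find, install, and maintain a third-party upload tool, which online services such as Flickr, Photobucket, and SmugMug often rely on.

In our tests, adding new photos to a folder and editing with the various new tools (covered below), Picasa 3's automatic sync worked well and picked up changes almost immediately. We did not, however, try any bulk operations, such as dropping a few hundred photos from a shoot or a vacation into a shared folder. If you are doing that kind of work, let us know how it goes.

Another notable feature of Picasa 3 is an enhanced set of editing tools. Users now have more tools for cropping, correcting color and contrast, fixing red-eye, retouching, adding text, and more. Picasa's fixes and effects panels have changed along with these tools, and together they offer a fairly complete feature set for a free photo management program.

A new feature that Picasa 3 presents to users at first run is that Picasa can now take over from Windows Explorer as the image viewer. Most common image formats are selected by default, including JPG, RAW, TIF, and BMP, and Google has added some extras to the viewer itself if you choose to use it. Double-clicking an image in Windows Explorer brings up a dimming overlay over the desktop and any open applications and shows a borderless full version of the photo with a smooth scaling animation. At the bottom, just above the Windows taskbar, sits a media strip with thumbnails of the other images in the folder and basic navigation controls.

The effect is attractive (though arguably unnecessary) and recalls the various "Lightbox" JavaScript image packages seen on some websites and blogs. Picasa's image viewer does not load any slower than the default Windows image viewer, and in our tests it usually felt a bit snappier.

Last but not least among Picasa 3's new features is a basic movie maker integrated with YouTube. Users can turn a selection of photos into a simple movie, choose from a number of transitions and resolutions (up to 1080p), add text slides, and add an audio track. Apart from one-click publishing to YouTube there is not much here to get excited about, but these are still nice features that should keep mom and the family happy over the holidays.

Face recognition in Picasa Web Albums

With Picasa 3 covered, we went through the new features of Picasa Web Albums and found that they complement the desktop experience well, if somewhat awkwardly at times. By far the most impressive feature is "Name Tags", which uses face recognition to identify the people in photos automatically. That makes it easy to keep an album automatically updated with photos of Cousin Erin or your parents. But all of this has to be done in Picasa Web Albums. The Picasa 3 desktop client does none of it, and does not seem to gain anything from all this automation and face tagging.

Google's face tagging did quite well in our tests, picking the same person out of a set of photos taken at different events and even under different lighting. It was not perfect every time, but Google has built in features such as buttons for commonly tagged names and integration with Gmail contacts for quickly picking names out of the crowd. Google is certainly not the first to offer face recognition in a web service (Facebook has offered a more basic set of similar features for some time), but this is one of the best implementations we have seen.

If you want to try Name Tags, be sure to switch it on in the Picasa Web Albums settings; it is off by default.

For anyone fairly serious about their photography, another notable feature of Picasa Web Albums is the integration of Creative Commons licensing. A default Creative Commons license can be set in the settings to allow remixing, allow commercial use, and require share-alike terms; users can of course choose to allow none of these. The license can also be customized for each photo in the sidebar.

Besides a refreshed interface, Picasa Web Albums has received a series of smaller refinements. The last one worth mentioning is an Explore page that lets visitors browse Picasa's public library and a hand-picked selection of outstanding photography. Popular tags are a welcome touch, though geolocation tools would have made for an even more interesting experience.

We have now gone over our original 1,000-word budget, but this Picasa update deserves it. Google has had untapped potential in the photo sharing space for some time with these two separate products. It is good to see the offline photo management tool integrating the way it should with the online publishing and sharing service. Flickr probably has little to worry about yet, but it would be no surprise to see a new batch of users trying the more convenient new Picasa.

Original: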

Google announced major updates yesterday to both components of its Picasa ecosystem: its desktop client and Web Albums, a service for displaying and sharing photos online. Ars Technica took some time to dust off the old family photos to see what the new editing, publishing, and identifying features are all about.

Google’s Picasa announcement is two-fold, involving a desktop client and web service that are now much more closely integrated, so we’ll try to cover them separately while pointing out the new ways in which they shake hands. Since Picasa hinges in many ways on its desktop client, though, we’ll start there.

Picasa 3 for your Windows PC (make a Mac OS X client please)

Sporting a much more refined overall UI, Picasa 3’s new features and publishing integration are easily worth more than 1,000 words (but that’s about where we’ll get in this piece). Unfortunately, Picasa 3 is only available for Windows right now, with the Linux version still at 2.7 and a Mac version only rumored to be on its way before 2008 is over. Picasa’s UI is tighter, with smaller buttons in some areas like its top toolbar, and it even adopts new UI tools like the button slider from the iPhone OS for toggling features like syncing an album or folder to Picasa Web Albums (more on that in a minute). We haven’t been heavy users of Picasa in the past, but the changes are definitely welcome and make the app feel more mature.

Probably the most significant new feature in Picasa 3 is a new “Sync to Web” switch that adorns any folder or album. As you can imagine, this allows any new photos, or even changes made to existing photos, to be automatically synchronized up to Picasa Web Albums (a free Google Account is required), Google’s Flickr competitor. This new feature by itself transforms Picasa and its web counterpart into a much more appealing solution for organizing and sharing photos, as it takes the complicated busywork out of publishing photos online. It also removes the need to find, install, and maintain some sort of third-party uploading tool, which is often the case with similar online services like Flickr, Photobucket, and SmugMug.

In our testing with adding new photos to a folder and making edits using various new tools (again, more in a moment), Picasa 3’s automatic sync feature worked really well and kicked in immediately after every change we made. We did not, however, perform any heavy lifting, such as dumping a couple hundred photos from a shoot or vacation into a shared folder. Let us know what your experience was like if you’re doing work like this.

Next on Picasa 3’s notable new features list is an enhanced set of editing tools. Users now have more tools for cropping, correcting color and contrast, fixing red-eye, retouching, adding text, and more. Naturally, these tools are accompanied by changes to Picasa’s fixes and effects panels, and they offer a fairly well-rounded set of features as far as free image organization tools go.

A unique new feature that Picasa 3 presents to users with a first-run wizard is the ability to take over Windows Explorer’s duties as an image viewer. A handful of basic supported image formats are checked by default, including .JPG, .RAW, .TIF, and .BMP, and Google added some special sauce to the actual viewer if you opt to use it. Double-clicking an image in Windows Explorer will evoke a dimming overlay over the desktop and any open applications, displaying a border-less full version of the photo with a slick scaling animation. A media strip sits below, just above the Windows task bar, containing thumbnails of other images in the folder and basic navigation controls.

The effect is nice (though arguably unnecessary) and reminiscent of various “Lightbox” JavaScript image packages you may have seen on some websites and blogs. Picasa’s image viewer doesn’t seem to take any longer to load when compared to Vista’s default of Windows Photo Gallery, and usually seemed to be a bit snappier in our testing.

Last but not least in Picasa 3’s repertoire of new tricks is a basic movie maker with YouTube integration. Users can group a selection of photos into a simple movie, pick from a handful of transitions and resolutions (all the way up to 1080p), add text slides, and apply an audio track. Besides one-click publishing to YouTube there isn’t much to get our shutters out of sync over, but it’s still nice to have features that are sure to make mom and the family happy come holiday season.

Picasa Web Albums does face recognition

With Picasa 3 out of the way, we gave some of Picasa Web Albums’ new features a run-through to find that they complement its desktop brethren’s experience pretty well, if not awkwardly at times. Easily the most impressive new feature is “Name Tags,” which harnesses facial recognition technology to automatically identify people in photos. This makes it easy to, say, keep a running album automatically updated with photos of Cousin Erin or your parents, but this all has to be done on Picasa Web Albums. Picasa 3 does none of this work on the desktop, and doesn’t appear to gain any of the advantages of all this automation and facial recognition tagging.

That said, Google’s facial tagging technology was pretty impressive in our testing, as it did reasonably well when picking the same individual out of a group of photos shot at different events and even in different lighting. It wasn’t perfect every time, of course, but Google built in features like buttons for commonly tagged names and, naturally, integration with one’s Gmail contact list for quickly picking names out of the crowd. Google certainly isn’t the first to offer facial recognition in a web service like this (Facebook has offered a much more basic set of similar features for some time now), but this is definitely one of the best implementations we’ve seen.

If you want to give this “name tags” feature a try, be sure to switch it on in your Picasa Web Albums Settings area; it isn’t enabled by default yet.

Another significant feature of Picasa Web Albums for anyone halfway serious about their photography is the integration of Creative Commons licensing. A default CC license can be set in the Settings area to allow remixing, allow commercial use, and require share-alike terms, or users can, of course, opt to not allow any of these options. This license can also be customized for each photo in the sidebar.

Picasa Web Albums has received lots of other refinements, in addition to a refreshed UI of its own. Last on our mentionable list for now is a new Explore page that allows visitors to sift through Picasa’s public library and a hand-picked selection of prime photography. Popular tags are a welcome touch, though it would’ve been nice to see geolocation tools offered here for an even more interesting experience.

Worth more than 1,000 words

Ok fine, so we went over our initial word budget, but this major Picasa update deserved it. Google has been squandering some unique potential in the photo sharing space for a while now with these two separate components, and it’s great to see its offline photo organization tool integrating the way it should with an online publishing and sharing service. Flickr probably doesn’t have anything to worry about just yet, but we wouldn’t be surprised to see a new batch of users giving the convenient new Picasa ecosystem a try.

The Best Tools for Visualization[ZZ]

The article below introduces the following visualization tools. A Chinese translation (by 帕兰映像) is available at: http://parandroid.com/magic-number-of-100-visualization-technology-application/
Written by Sarah Perez / March 13, 2008 9:25 AM

Visualization is a technique to graphically represent sets of data. When data is large or abstract, visualization can help make the data easier to read or understand. There are visualization tools for search, music, networks, online communities, and almost anything else you can think of. Whether you want a desktop application or a web-based tool, many specific tools are available on the web that let you visualize all kinds of data. Here are some of the best:

Visualize Social Networks

Last.Forward: Thanks to Last.fm’s new widget gallery, you can now explore a wide selection of extras to extend your Last.fm experience. The gallery hosts widgets for your desktop, for the web, for social networks, and much more. One of the better tools in the gallery, last.forward, is open source software that lets you map out any last.fm user and their connections. The web site for the software appears to be in German, but the “Download” button still works. And once it was downloaded and installed, I had no trouble using it myself.

Last Forward

Friends Sociomap: Friends Sociomap is another Last.fm tool that generates a map of the music compatibility between you and your Last.fm friends.

Fidg’t: Fidg’t is a desktop application that gives you a way to view your network’s tagging habits. You can see what kind of music your network is into, or what kind of pictures they are taking. The Fidg’t Visualizer allows you to play around with your network. To use Fidg’t, you interface with the Visualizer through Flickr and LastFM tags, using any tag to create what they call a “Magnet.” Once a Tag Magnet is created, members of the network will gravitate towards it if they have photos or music with that same Tag. You can also search through the network for certain users, and see their recent photos and music. The Fidg’t interface is beautiful, too.

Fidg’t

The Digg Tools:

Digg.com has some of the best web-based visualization tools on the net, so they’re a must for any visualization list.

  • Pics: Digg Pics is the latest tool that tracks the activity of images on the site with images that slide in from the left as people submit them and digg them.
  • Arc: Digg Arc displays stories, topics, and containers wrapped around a sphere. The more diggs, the thicker the arcs.
  • BigSpy: Digg BigSpy places stories at the top of the screen as they are dugg. Bigger stories have more diggs.
  • Stack: Digg Stack shows diggs in real time, with diggs falling from the top of the screen. As stories get more diggs, they’re shown in brighter colors.
  • Swarm: Digg Swarm draws circles for stories as they’re dugg. Diggers swarm around stories which makes them grow and get brighter.

One more: Digg Radar. Although this is an unofficial visual aid, Digg Radar is worth a look too. With Digg Radar, you wait and watch for buttons to appear on the map which indicate that a person has Dugg a story. Hover over the button to see their username. Click it to see details about the story, with links to the Digg page or directly to the article.

YouTube:

You can discover related videos using YouTube’s visualizations. To use this feature, go to a YouTube video, click on the full-screen button, and then click on the small button that shows a network. You’ll see a lot of video balloons appear and the configuration will change when you hover over a button.

Visualize Music

  • Liveplasma and Musicovery let you discover new music.
  • Tuneglue music map is a “relationship explorer,” similar to LivePlasma. Using data from Amazon and Last.fm, Tuneglue explores relationships between musical artists.
  • Moody lets you tag your music collection with colors. They also have a color-coded web player. (our coverage)
  • The Echo Nest is an audio analysis tool which takes an mp3 file, breaks it up into little segments, and gives pitch, loudness, and high-level timbral descriptions of each one of those segments. The program maps a subset of this audio data onto a visual scale and creates video playback of the song. (more)
  • An interactive harmony model of music which geometrically describes relationships in harmony. The model can be a visualization tool for songwriters or students of music.
  • Musiclens gives music recommendations and presents your current mood and musical taste as a diagram.
  • Shape Of Song: What does music look like?
  • Musicmap: connections are represented as connected lines; they create a web.

Musicovery

Last.fm music visual tools:

  • Last Graph: Create artist wave graphs from your musical history in PDF and SVG format.
  • Extra Stats: Colorful Stats and tag clouds.

Visualize the Internet

  • Opte is a project that lets you graphically map the internet. The data represented and collected here serves a multitude of purposes: Modeling the Internet, analyzing wasted IP space, IP space distribution, detecting the result of natural disasters, weather, war, and esthetics/art.
  • Akamai Technologies, who deliver 15-20% of all web traffic, offered up some interesting tools last year for viewing their traffic data. (Our coverage) From their flagship app, the Real-time Web Monitor, which shows countries with the most traffic, to the Network Performance Comparison app, Akamai’s tools are an interesting way to see the web in real time. In all, they offer 6 Flash-based apps to the public.
  • Other internet traffic visualizations include the Internet Health Report and the Internet Traffic Report.
  • MantaRay displays the geographical placement of MBONE infrastructure (Multi-cast backbone) of the internet. Otter displays topological views of the (same) multicast infrastructure.
  • Packet Garden is an app that watches your Internet traffic and builds a private world that you can later explore.
  • Mapnet is a Java applet to visualize the topologies of backbones of major U.S. Internet Service Providers.
  • Websites as graphs. An HTML DOM Visualizer Applet, which displays sites as graphs depending on the amount of links, tables, div tags, images, forms and other tags.

Packet Garden

Amazon

  • LivePlasma: music discovery (see also music section of this list)
  • Flowser is another flash-based Amazon visualization for search.
  • BrowseGoods is a visualization that lets you zoom and pan Amazon’s catalog of products.
  • Tuneglue music map is a “relationship explorer,” similar to LivePlasma. Using data from Amazon and Last.fm, Tuneglue explores relationships between musical artists. (see also music section of this list)
  • Coverpop is more of an art project that lets you browse Amazon via a collage.
  • Amaztype, a typographic book search, collects the information from Amazon and presents it in the form of keyword you’ve provided. To get more information about a given book, simply click on it.

Flickr

  • Taglines lets you visualize Flickr tags over time
  • Flickrvision: view real-time flickr photos on a map.
  • Flickrtime is a tool that uses Flickr API to present the uploaded images in real-time. The images form the clock which shows the current time.

Some details on these: see “Alternative ways to browse Amazon” (our coverage)

Miscellaneous

  • Visual Thesaurus: The Visual Thesaurus is an interactive dictionary and thesaurus which creates word maps that blossom with meanings and branch to related words.
  • Twittervision: view real-time tweets on a map.
  • 17 More Ways to Visualize Twitter
  • All the ways to visualize del.icio.us collected here.
  • Three Views shows three views of the earth, in which each country is represented by a circle that shows the amount of money spent on the military (size of circle) and what fraction of the country’s earnings that uses (color).
  • We Feel Fine shows human feelings calculated from a large number of weblogs.
  • Interactive History Timeline presents the history of Great Britain, divided into interactive data blocks.
  • Winning Lotto Numbers shows the frequency of appearance of every number from one year to the next one.
  • Language Poster – the history of programming languages

Sites Dedicated to Visualization

Many Eyes

Search

Heatmaps:

Heatmaps site CrazyEgg applies heatmaps to tracking what visitors do on a user’s website. Their software captures user clicks on each page and then presents a summary in the form of a heatmap. Other heatmap sites include Feng-GUI and FuseStats. Summize applies heatmaps to shopping via their search engine(our coverage here, here and here).
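
As a rough illustration of the click-to-heatmap step, here is a small Python sketch that bins click coordinates into grid cells. The function name, the 50-pixel cell size, and the sample clicks are assumptions for the example; this is not how CrazyEgg or any other named product is implemented.

```python
from collections import Counter

def click_heatmap(clicks, cell_size=50):
    """Count clicks per grid cell.

    `clicks` is an iterable of (x, y) pixel coordinates; each cell is
    identified by its (column, row) position in a grid of cell_size pixels.
    """
    counts = Counter()
    for x, y in clicks:
        counts[(x // cell_size, y // cell_size)] += 1
    return counts

# Hypothetical clicks recorded on one page.
clicks = [(10, 20), (30, 45), (120, 20), (49, 49)]
heat = click_heatmap(clicks)
```

A renderer would then map each cell's count to a color, with higher counts drawn "hotter".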

Visualizing the Power Struggle in Wikipedia displays the most popular articles and the most frequent search queries in the heatmap.

Visual Search Engines:

  • Riya’s Like.com: the first true visual search engine; it does visual search for shopping.
  • Searchme: upcoming visual search for the web
  • Xcavator: A photo search engine which utilizes visual clues that you provide to identify and extract similar pictures from large groups of digital images.
  • ManagedQ: A visual search experiment with some built-in semantics. (our coverage)
  • oSkope: Visual search engine for finding products that searches Amazon, Ebay, Flickr, Fotolia, Yahoo!Image Search and YouTube.
  • Quintura: visual search engine that uses clouds, tags, and highlighting.
  • Tafiti: Microsoft’s experimental visual search engine running on Silverlight.
  • Retrievr is an experimental service which lets you search and explore in a selection of Flickr images by drawing a rough sketch.
  • Mooter: Visual search engine that organizes results in clusters.
  • KartOO: visual web search.
  • SearchCrystal is a search visualization tool that lets you compare, remix and share results from sources on the web, whether sites, images, videos, blogs, news engines or RSS feeds. (see also KoolTorch)
  • Spacetime: search Google, YouTube, RSS, eBay, Amazon, Yahoo!, Flickr and images all in one 3D space.
  • grokker: web search or enterprise search offering map views of data.
  • Burst Labs suggests similar or connected items to your search queries in a bubble
  • UBrowser renders interactive web pages onto geometry using OpenGL and an embedded instance of Gecko
  • walk2web – enter a URL, then visually browse web sites linked from it
  • TouchGraph’s Amazon Browser, Google Browser, and LiveJournal Browser

Touchgraph

News and RSS

  • Voyage is an RSS reader which displays the latest news in the “gravity area”. News can be zoomed in and out. Navigation is possible with a timeline.
  • Newsmap is an application that visually reflects the constantly changing landscape of the Google News news aggregator.
  • Universe DayLife displays events, connections and news as circles which gravitate around the topic they are related to.

Data

Swivel

some sources – via, via

A Tough Day in the Life of a Chinese Person

[ZZ from 88]

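I wake up in the morning and brush my teeth with Tianqi toothpaste containing excess diethylene glycol, then wash my face in foul water full of blue-green algae. I mix a bottle of Sanlu milk powder for my son, drink a cup of soy milk from an unlicensed workshop, and eat a few steamed buns bleached white with sulfur fumes, or perhaps buns stuffed with waste cardboard instead of meat, along with some pickled mustard tuber cured in old paint buckets. I put a 2005-vintage zongzi from Anhui in my bag (for when I get hungry at work). Fed and watered, I step outside, take a deep breath of PX-rich air, and stretch my legs on the Jiujiang Bridge, now broken in two, to get a feel for how Lady White and Xu Xian felt on the Broken Bridge.
At noon I go with colleagues to KFC for Sudan Red fried chicken and a cola with excess benzene. In the afternoon I call a friend and hear her sobbing, probably because she lost money on stocks, so I invite her to a newly opened restaurant for dishes cooked in gutter oil: a plate of spicy eel fattened on contraceptive pills, a plate of spicy crayfish hauled out of a sewage ditch, and stir-fried spinach with heavy pesticide residue. The owner serves a cup of Biluochun tea with 100 times the permitted level of heavy metals, and we wash it down with beer containing formaldehyde... The bill comes to 168 (a rip-off, and no discount), a lucky number, and the owner hands back a counterfeit note in change.
On the way home I get knocked down by a BMW. What luck! Time to make the guy pay up, so I lie still. Then I open my eyes and see the BMW turning around to run me over... I scramble up and flee at record speed.
Home. At bedtime the formaldehyde that came free with the new renovation makes my eyes stream, so I bury my head in a quilt stuffed with "black-heart" cotton. I remember the 400,000 still owed on the mortgage, plus interest, and toss and turn until dawn without a wink of sleep. I find the sleeping pills and take half a bottle, to no effect; I hold one in my mouth and, oh, it is glutinous rice.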

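Milk Powder and Melamine

The Olympics are about to close, while an iceberg is slowly coming into view: a chemical that seemed far removed from our lives is eating away at lives. People have been gradually cooling down from the Olympics. Looking around, apart from the Olympics having made the internet a little more open, the news is all disheartening. Housing prices have started to fall, which is good news for the many of us betting on lower prices, but the deeper subprime crisis has already reached China. Lehman Brothers, one of the four big US investment banks, has collapsed, and the stock market reacted at once: CCB and ICBC hit the daily down limit in less than half a day. The Shanghai index has fallen below 2,000, a real plunge.
Being short of cash, I had recently switched from fresh milk to milk powder, but only a few days after my first cup the Sanlu milk powder scandal broke. As the investigation unfolds, what lies behind it is more and more unsettling. CCTV's Xinwen Lianbo evening news has just reported:

The General Administration of Quality Supervision, Inspection and Quarantine announced the results of nationwide sampling of infant formula for melamine. Melamine was found in 69 batches from 22 manufacturers, including Hebei Sanlu, Shanxi Yashili, Inner Mongolia Yili, Mengniu Group, Qingdao Shengyuan, Shanghai Panda, Shanxi Gucheng, Jiangxi Guangming Dairy's Yingxiong brand, Baoji Huimin, Duojiaduo Dairy, and Hunan Nanshan, and the products were ordered off the shelves immediately.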

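At first glance, luckily, what I bought was Nestlé, and adult milk powder at that. But on reflection, the milk used for infant formula can only be better than that used for adult powder, since manufacturers surely pay more attention to infant formula. That flicker of relief went out at once. Not ready to give up, I googled Nestlé and melamine and found the statement below, along with old news about Nestlé's genetically modified milk powder and excessive iodine content...
If milk powder is like this and the problem lies in the milk supply, it is only a matter of time before fresh milk is found to have problems too. Even a company like Yili, an Olympic sponsor, has been caught, and as inspections intensify the list of named companies is sure to grow. It is a fine idea: nobody cheats and everyone lives a high-quality life. Why not? Who discovered the "clever use" of this chemical? One small compound has poisoned so many young lives... But was it the compound? It was people; the greediest thing is the human heart.
PS: the statement on Nestlé's website; let's hope it is true.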
==============================================

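Statement on the melamine contamination of Sanlu infant formula products
Nestlé fully understands the concerns of parents and deeply sympathizes with every family affected by this terrible event. The death of any infant is a tragic loss of life.
As a leading and responsible food company in China, Nestlé can assure all consumers and customers that none of our infant formula or milk powder products has ever been contaminated with melamine. Our products undergo continuous quality testing both during and after production to ensure the highest level of quality and safety. Nestlé therefore asks consumers and customers to rest assured that Nestlé products are completely safe to consume.
Nestlé applies extremely rigorous quality control measures and risk assessment systems at every level of raw material sourcing and production. We strictly prohibit the use of melamine throughout the production process. Nestlé has always fully complied with national and international regulations.
Product safety and quality are always Nestlé's uncompromising and foremost priority. Nestlé acts proactively and swiftly to ensure food safety for consumers.
September 15, 2008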

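A Visit to Miss Lin

The sun finally came out for a bit this afternoon, so I set off in high spirits to visit the artist. The bus ride was not smooth, but after some exploring I reached my stop just before the rain got heavy; the weather really does toy with people. Life in Jingdezhen has filled Miss Lin with energy. She is a freelancer, but not idle at all, and by all accounts she is doing well. We swapped mooncakes: I traded a Qiushi mooncake for a ceramic one, which I hope appreciates soon, hahaha. In the afternoon hbn phoned to say they wanted crab, which gave me an idea at once, so we got some crabs too, steamed them, and dug in. Eating crab is actually hard work for the tongue... -_-!!

An artist's life is simple and full, not as dreamy as most people imagine, and it can even look dull, but when the finished pieces appear one by one they feel rewarded. In the afternoon Miss Lin demonstrated painting with high-tech tools, and after repeated requests I finally managed to "rob" a piece, haha. She only made a few casual strokes to show how the tool works, but I count my mission accomplished. Given her ability, it should appreciate quickly; if you want to buy it at a high price, contact me discreetly, hehe. The Mid-Autumn holiday is over~~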