Hyperlink-Induced Topic Search (HITS), also known as Hubs and
Authorities, is a link analysis algorithm for rating Web pages,
developed by Jon Kleinberg. It appeared at around the same time as
PageRank.
The idea behind Hubs and Authorities stemmed from a particular
insight into how web pages were created in the early days of the
Web: certain web pages, known as hubs, served as large directories
that were not themselves authoritative on the information they held,
but were used as compilations of a broad catalog of information that
led users directly to other, authoritative pages. In other words, a
good hub is a page that points to many other pages, and a good
authority is a page that is linked to by many different hubs.
The algorithm computes two values for each page:
1. Page authority, which estimates the value of the content of the
page.
2. Page hub value, which estimates the value of its links to other pages.
HITS first retrieves the set of results for the search query, so that
the computation is performed only on this result set and not across
all Web pages.
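As a rough illustration of restricting the computation to the result set, the sketch below keeps only the links whose source and target both appear among the query results. The function name `induced_subgraph`, the dictionary layout, and the sample pages are assumptions made for this example; Kleinberg's original formulation also expands the result set with neighbouring pages, which is left out here.

```python
def induced_subgraph(links, result_set):
    """Keep only links whose source and target are both in result_set.

    links: dict mapping a page to the list of pages it links to.
    result_set: iterable of pages returned for the search query.
    """
    keep = set(result_set)
    return {
        page: [target for target in targets if target in keep]
        for page, targets in links.items()
        if page in keep
    }


# Hypothetical link data: page -> pages it links to.
web_links = {
    "a": ["b", "c", "x"],
    "b": ["c"],
    "c": [],
    "x": ["a"],
}

# Suppose the query returned pages a, b and c; page x is dropped.
focused = induced_subgraph(web_links, ["a", "b", "c"])
```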
The algorithm performs a series of iterations, each consisting of
two basic steps:
Authority Update: Update every node's authority score to be equal
to the sum of the hub scores of every node that points to it. That
is, a node is given a high authority score by being linked to by
pages that are recognized as hubs for information.
Hub Update: Update every node's hub score to be equal to the sum of the authority scores of every node that it points to. That is, a node is given a high hub score by linking to nodes that are considered to be authorities on the subject.
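Written as formulas, the two update rules look roughly as follows. The notation is an assumption introduced here: auth(p) and hub(p) are the scores of page p, and q → p means that page q links to page p.

```latex
\mathrm{auth}(p) = \sum_{q \,:\, q \to p} \mathrm{hub}(q),
\qquad
\mathrm{hub}(p) = \sum_{q \,:\, p \to q} \mathrm{auth}(q)
```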
The Hub score and Authority score for a node are defined with the
following algorithm:
1. Start with every node having a hub score and authority score of
1.
2. Run the Authority Update Rule
3. Run the Hub Update Rule
4. Normalize the values by dividing every Hub score by the square root of the sum of the squares of all Hub scores, and dividing each Authority score by the square root of the sum of the squares of all Authority scores.
5. Repeat from the second step as necessary.
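The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `hits`, the dictionary-based graph, the fixed `iterations` count (standing in for "as necessary") and the sample pages are all assumptions. Normalization here divides by the square root of the sum of the squares (the Euclidean norm), so that the squared scores sum to 1.

```python
import math


def hits(links, iterations=20):
    """Return (hub, authority) score dicts.

    links: dict mapping a page to the list of pages it links to.
    iterations: assumed fixed number of rounds instead of a convergence test.
    """
    # Collect every page that appears as a source or as a target.
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)

    # Step 1: every node starts with hub and authority score 1.
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}

    for _ in range(iterations):
        # Step 2, authority update: sum the hub scores of pages linking in.
        new_auth = {n: 0.0 for n in nodes}
        for src, targets in links.items():
            for dst in targets:
                new_auth[dst] += hub[src]

        # Step 3, hub update: sum the authority scores of pages linked to.
        new_hub = {
            n: sum(new_auth[dst] for dst in links.get(n, []))
            for n in nodes
        }

        # Step 4, normalize by the Euclidean norm (guarding against zero).
        auth_norm = math.sqrt(sum(v * v for v in new_auth.values())) or 1.0
        hub_norm = math.sqrt(sum(v * v for v in new_hub.values())) or 1.0
        auth = {n: v / auth_norm for n, v in new_auth.items()}
        hub = {n: v / hub_norm for n, v in new_hub.items()}

    return hub, auth


# Hypothetical graph: two directory-like pages pointing at two content pages.
sample_links = {
    "h1": ["a1", "a2"],
    "h2": ["a1"],
    "a1": [],
    "a2": [],
}
hub_scores, auth_scores = hits(sample_links)
```

In this sample graph, `a1` ends up with a higher authority score than `a2` because both directory pages link to it, and `h1` ends up with a higher hub score than `h2` because it links to both content pages.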
A Comparison between the PageRank and Hyperlink-Induced Topic Search (HITS) Algorithms
Hyperlink-Induced Topic Search:
- HITS is based on two quality values, an authority score and a hub score. A page's authority score is calculated from the hub pages that link to it, and its hub score is calculated from the authority pages that it links to. The overall HITS result depends on the interplay between these two values, so it calculates two scores per document.
- HITS operates on a small subgraph representing the links between hub and authority pages.
- In HITS, the two weights reinforce each other: an increase in the authority weight of a page increases the hub weight of the pages that link to it.
- HITS calculates its scores at query time, without a prior indexing step.
- HITS is particularly suited to analyzing the relationships between websites.
PageRank:
- PageRank is based on a number of different factors, especially the number of quality backlinks. Quality backlinks are links that are relevant to the niche of the website and are placed on websites with a high PageRank. PageRank therefore calculates mainly one score per document.
- PageRank operates on a large Web graph, taking into account all the backlinks and relevance factors.
- In PageRank, a quality backlink on a high-PageRank website increases the PageRank of the linked website.
- PageRank calculates its scores after the indexing process.
- PageRank can be applied beyond Web pages, for example in Street rank (ranking places other than websites on the basis of how many people visit them). Similarly, PageRank is used in multiple environments, from institutes to search engine crawlers.
Both HITS and PageRank have their benefits, and both can be applied
in different scenarios. PageRank is more popular because it can be
used in multiple environments other than Web search. HITS is very
useful because of its specific focus on categorizing hub and
authority websites.