- #1
adohertyd
- 1
- 0
I am using a Markov Chain to get the 10 best search results from the union of 3 different search engines. The top 10 results are taken from each engine to form a set of 30 results.
The chain starts at State x, a uniform distribution of set S = {1,2,3,...30}. If the current state is page i, select page j uniformly from the union of the results from each search engine. If the rank of j < rank of i in 2 of the 3 engines that rank both i and j, move to j. Else, remain at i.
I understand the above no problem. The sorting point is where I am stuck however. The paper I am using that explains this says:
This is known as a Markov process, where the transition matrix P has P(i, j) = [itex]\frac{1}{n}[/itex] if a majority of the input rankings prefer j to i, and P(i, i) = 1−Ʃj≠i P(i, j). Under certain conditions, this process has a unique (up to scalar multiples) limiting distribution x that satises x = xP, where x(i) gives the fraction of time the process spends at element i. Dwork et al. propose sorting the elements by non-increasing x(i) values. To ensure that the process has a unique limiting distribution x, we use a "random jump": with probability [itex]\delta[/itex] > 0, we will choose a random element and move to this element (regardless of whether this element is preferred to the current element). In our experiments we have used [itex]\delta[/itex]= [itex]\frac{1}{7}[/itex] , which is the value of [itex]\delta[/itex] that is often chosen in the literature for PageRank implementations.
Could someone please explain this to me in plain english because I am completely lost with it at this stage.
This paper can be found http://www.siam.org/proceedings/alenex/2009/alx09_004_schalekampf.pdf with this specific part on page 43.
The chain starts at State x, a uniform distribution of set S = {1,2,3,...30}. If the current state is page i, select page j uniformly from the union of the results from each search engine. If the rank of j < rank of i in 2 of the 3 engines that rank both i and j, move to j. Else, remain at i.
I understand the above no problem. The sorting point is where I am stuck however. The paper I am using that explains this says:
This is known as a Markov process, where the transition matrix P has P(i, j) = [itex]\frac{1}{n}[/itex] if a majority of the input rankings prefer j to i, and P(i, i) = 1−Ʃj≠i P(i, j). Under certain conditions, this process has a unique (up to scalar multiples) limiting distribution x that satises x = xP, where x(i) gives the fraction of time the process spends at element i. Dwork et al. propose sorting the elements by non-increasing x(i) values. To ensure that the process has a unique limiting distribution x, we use a "random jump": with probability [itex]\delta[/itex] > 0, we will choose a random element and move to this element (regardless of whether this element is preferred to the current element). In our experiments we have used [itex]\delta[/itex]= [itex]\frac{1}{7}[/itex] , which is the value of [itex]\delta[/itex] that is often chosen in the literature for PageRank implementations.
Could someone please explain this to me in plain english because I am completely lost with it at this stage.
This paper can be found http://www.siam.org/proceedings/alenex/2009/alx09_004_schalekampf.pdf with this specific part on page 43.