Tribhuvan University
Institute of Science and Technology
IR Model Question
Bachelor Level / elective-iii-semester / Science
Computer Science and Information Technology (CSC413)
Information Retrieval
Full Marks: 60 + 20 + 20
Pass Marks: 24 + 8 + 8
Time: 3 Hours
Candidates are required to give their answers in their own words as far as practicable.
The figures in the margin indicate full marks.
SECTION A
Attempt any TWO questions.
What is the role of information in human life? How is information retrieved? Explain the architecture of an information retrieval system.
Define the role of text shingling. Given the following training dataset, apply the Rocchio algorithm to classify the document “process scheduling”. Here, the two classes are “operating system” and “Automata”.
operating system :- disk scheduling
operating system :- process management
Automata :- process transition
Automata :- context free grammar
How are rules defined in the Porter stemmer? Given the following documents and query, rank the documents.
doc1 = “finite state machine”
doc2 = “transition machine”
doc3 = “transition state”
query = “state machine”
SECTION B
Attempt any EIGHT questions.
How is an inverted index created? Explain with a suitable example.
What is the effect on precision and recall in evaluating ranked documents? Illustrate with an example.
Explain the working mechanism of a search engine.
Why do we need to expand a query? How is it performed using relevance feedback?
Differentiate between collaborative and content-based recommendation systems.
How are questions and answers processed in a QA system? Explain.
Describe the significance of latent semantic indexing and singular value decomposition.
How does a spider work? Write the algorithm.
Define stop words. How do you stem a word?