Discussion:
Automatically extracted Ant FAQs
Stefan Henß
2011-03-07 12:14:47 UTC
Hi everybody,

I'm currently doing research for my bachelor thesis on how to
automatically extract FAQs from unstructured data.

For this I've built a system automatically performing the following:
- Load thousands of conversations from forums and mailing lists (ignoring
the categories used there and not distinguishing between sources).
- Build a new categorization based solely on the conversations' texts (by
clustering).
- Pick the best-modelled categories, each as the basis for one FAQ.
- For each question (first entry in a thread) find the best reply among
its answers.
- Select the most relevant and well-formatted question/answer pairs for
each FAQ.
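
The steps above can be illustrated with a small Python sketch. The post does not say which clustering or answer-ranking method the system actually uses, so everything here is an assumption for illustration only: the greedy word-overlap clustering, the Jaccard-based reply ranking, the thresholds, and all function names and sample threads are hypothetical stand-ins, not the author's implementation.

```python
import re


def tokens(text):
    """Lowercase word set, ignoring words of one or two letters."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if len(w) > 2}


def jaccard(a, b):
    """Set overlap in [0, 1]; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(threads, threshold=0.2):
    """Greedy single-pass clustering (an assumed stand-in): a thread joins the
    first cluster whose seed question is similar enough, else starts a new one."""
    clusters = []
    for thread in threads:
        words = tokens(thread["question"])
        for c in clusters:
            if jaccard(words, c["seed"]) >= threshold:
                c["threads"].append(thread)
                break
        else:
            clusters.append({"seed": words, "threads": [thread]})
    return clusters


def best_reply(thread):
    """Pick the reply that shares the most vocabulary with the question."""
    q = tokens(thread["question"])
    return max(thread["replies"], key=lambda r: jaccard(q, tokens(r)))


def build_faqs(threads, min_size=2):
    """One FAQ per sufficiently large cluster, as (question, answer) pairs."""
    faqs = []
    for c in cluster(threads):
        answered = [t for t in c["threads"] if t["replies"]]
        if len(answered) >= min_size:
            faqs.append([(t["question"], best_reply(t)) for t in answered])
    return faqs


# Hypothetical sample data; source and original category are deliberately absent.
threads = [
    {"question": "How do I add a jar to the classpath in Ant?",
     "replies": ["Thanks!", "Add the jar to the classpath with a path element."]},
    {"question": "Adding a jar to the Ant classpath fails",
     "replies": ["Check that the jar path in the classpath is correct."]},
    {"question": "Tomcat does not start after deployment",
     "replies": ["Look at the Tomcat logs."]},
]
faqs = build_faqs(threads)
```

In this toy run the two classpath threads end up in one cluster and form a FAQ, while the single Tomcat thread is dropped because its cluster is too small. A real system would need a far more robust text model than word overlap.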

For the evaluation I'm interested in experts' perceptions of the
results, e.g. whether the questions are relevant, correctly answered, etc.
Also, as I'll be publishing a paper about the approach, I'd be happy if you
could rate one or two questions (stars on the details pages) so that I have
some statistics to present.


Here's the direct link to the Ant FAQs:
http://faqcluster.com/ant-java-build-task

(There are some other interesting FAQs as well at http://faqcluster.com/)


Thanks for your help

Stefan
Stefan Bodewig
2011-03-08 10:57:46 UTC
Post by Stefan Henß
I'm currently doing research for my bachelor thesis on how to
automatically extract FAQs from unstructured data.
Interesting approach.
Post by Stefan Henß
For the evaluation I'm interested in experts' perceptions of the
results, e.g. whether the questions are relevant, correctly answered, etc.
For most of the questions I've seen I wouldn't have expected them to
come out as "frequent". Some of them look misplaced (the "where do I
download/install specific tasks" questions are in the jar category, for
example).

Of the answers I've checked, most seemed to be correct (if there is a
correct answer at all; some questions may have more than one correct
answer, others none at all). Many answers are by Matt or
Jan, so they are correct 8-)

Stefan
J***@rzf.fin-nrw.de
2011-03-16 15:11:18 UTC
My two cents:

I am not sure whether the question about Maven 3 belongs in the Ant group.
http://faqcluster.com/question1764829863

The "question" is just one link.
http://faqcluster.com/question1772438546

Not sure whether this Tomcat question belongs in the Ant group.
http://faqcluster.com/question484708099

In this discussion, the required information has been lost:
http://faqcluster.com/question-1804157241

According to the start page, the Task group should contain 43 entries, but I can only see 15 (IE 8).
http://faqcluster.com/ant-task-script


Jan

