Plenary Lecture
Using Some Web Content Mining Techniques to Extract Arabic Text from the Web Documents
Assistant Professor Zakaria Suliman Zubi
Computer Science Department
Al-Tahadi University
Serit Post Office, P. O. Box 727
Serit, Libya
E-mail: zszubi@yahoo.com
Abstract: The massive volume of information now available on the World
Wide Web, together with the imminent need for new tools and techniques to
analyze this information and transform it into useful knowledge, has led to
a strong revival of web mining research. Web mining is one of the most
important areas of data mining: it applies data mining and other information
processing techniques to the World Wide Web to discover useful patterns.
People can benefit from these patterns to access the World Wide Web more
efficiently. Web mining is commonly divided into three main categories:
content mining, usage mining, and structure mining.
In this paper we apply web content mining to extract non-English language
knowledge from the Web. This requires investigating and evaluating the
possible methods by which web mining systems deal with issues in
language-specific text processing. We will use an Arabic
language-independent algorithm as a machine learning system. The algorithm
will use a set of features, represented as a vector of keywords, in the
learning process to apply text classification and clustering. However, such
algorithms usually depend on phrase segmentation and extraction programs to
generate the set of features or keywords that represent web documents. We
will also indicate some general aspects of mining Arabic text in web
documents.
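The idea of representing a web document as a vector of keywords can be illustrated with a minimal Python sketch. It is not the algorithm presented in the lecture: the regular-expression tokenizer stands in for the phrase segmentation and extraction programs mentioned above, the nearest-neighbour rule with cosine similarity is only one possible classifier, and the names `tokenize`, `build_vocabulary`, `to_vector`, `classify`, and the two-document `TRAINING` set are hypothetical.

```python
import math
import re
from collections import Counter

# Runs of characters in the basic Arabic Unicode range (U+0621..U+064A).
# Assumption: a simple stand-in for a real phrase segmentation program.
ARABIC_WORD = re.compile(r"[\u0621-\u064A]+")


def tokenize(text):
    """Return the Arabic tokens in text, skipping Latin text, digits and markup."""
    return ARABIC_WORD.findall(text)


def build_vocabulary(documents, size=50):
    """Pick the most frequent tokens across the documents as the keyword set."""
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    return [word for word, _ in counts.most_common(size)]


def to_vector(text, vocabulary):
    """Represent text as a vector of keyword counts over the vocabulary."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocabulary]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify(text, labelled_docs, vocabulary):
    """Return the label of the labelled document most similar to text."""
    vec = to_vector(text, vocabulary)
    best = max(labelled_docs,
               key=lambda item: cosine(vec, to_vector(item[0], vocabulary)))
    return best[1]


# Hypothetical toy training set: (document text, label).
TRAINING = [
    ("كرة القدم مباراة فريق", "sports"),
    ("سوق المال اقتصاد بنك", "economy"),
]
VOCABULARY = build_vocabulary(doc for doc, _ in TRAINING)

# A fragment of a web page: the tags and the English word are ignored.
print(classify("<p>فريق كرة</p> football", TRAINING, VOCABULARY))
```

The same keyword vectors could equally be passed to a clustering routine; classification is shown here only because it needs less code.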
Brief Biography of the Speaker:
Zakaria Suliman Zubi was born in Benghazi, Libya, in 1969. He received his
Ph.D. in Computer Science in 2002 from Debrecen University in Hungary;
before that he received his M.Sc. in Computer Science (Artificial
Intelligence) in 1998. He started his academic journey with a B.Sc. degree
in Computer Science in 1993. He joined the Department of Computer Science,
Faculty of Science, Al-Tahadi University, in 2003, where he has been an
Assistant Professor since 2006. Dr. Zubi has served the university in
various administrative positions, including Head of the Computer Science
Department (2003-2005), postgraduate study coordinator in the Computer
Science Department (to date), and postgraduate study coordinator for the
Faculty of Science for the academic year 2004-2005. He is also an
undergraduate and postgraduate lecturer in the Computer Science Department.
He is a reviewer for many local scientific journals in Libya, a member of the
Association for Computing Machinery (ACM), a member of the World
Scientific and Engineering Academy and Society (WSEAS), a member of the
Libyan Artificial Intelligent Association (LAIA), and a member of the Libyan
Quality Assurance in Higher Education (LQAHE) and of the Benchmark team. He
is also an external and internal member of many postgraduate examination
committee boards in Libyan universities, and an official member of the main
committee board for lecturer promotions at his university. His research
areas include Distributed Databases, Web Mining, Distributed Database
Mining, Knowledge Discovery on Remote Databases, Remote Query Optimization,
Queue Strategies on Local Networks, Operating Systems, Deadlocks in Operating
Systems, and Network and Distributed Database Security. He has published, as
author and coauthor, many research papers and technical reports in local and
international journals and conference proceedings. His hobbies are playing
chess, swimming, and listening to music.