Plenary Lecture

Using Some Web Content Mining Techniques to Extract Arabic Text from the Web Documents


Assistant Professor Zakaria Suliman Zubi
Computer Science Department
Al-Tahadi University
Serit Post Office, P. O. Box 727
Serit, Libya
E-mail: zszubi@yahoo.com


Abstract: The massive volume of information now available on the World Wide Web, and the pressing need for new tools and techniques to analyze this information and transform it into useful knowledge, have driven a strong revival of web mining research. Web mining is one of the most important issues in data mining; it applies data mining and other information processing techniques to the World Wide Web to discover useful patterns. People can use these patterns to access the World Wide Web more efficiently. Web mining is divided into three main categories: content mining, usage mining, and structure mining.
In this paper we apply web content mining to extract non-English language knowledge from the Web. This requires investigating and evaluating the possible methods by which web mining systems can deal with language-specific text processing. We will use a language-independent algorithm, applied to Arabic, as the machine learning system. The algorithm uses a set of features, represented as a vector of keywords, in the learning process to perform text classification and clustering. However, such algorithms usually depend on phrase segmentation and extraction programs to generate the set of features or keywords that represent web documents. We will also indicate some general aspects of mining Arabic text in web documents.
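
As a rough illustration of the keyword-vector idea described in the abstract, the following minimal Python sketch extracts Arabic words from an HTML page and compares two pages by the cosine similarity of their keyword vectors. It is not the speaker's algorithm: the names TextExtractor, arabic_keywords, and cosine_similarity are hypothetical, the word pattern covers only the basic Arabic letter range, and no segmentation, stemming, or stop-word removal is attempted.

```python
import math
import re
from collections import Counter
from html.parser import HTMLParser

# Assumption: a "word" is a run of letters in the basic Arabic range
# (U+0621 to U+064A). Diacritics and extended Arabic blocks are ignored.
ARABIC_WORD = re.compile(r"[\u0621-\u064A]+")


class TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML document, skipping the markup."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def arabic_keywords(html):
    """Return a keyword -> frequency vector for the Arabic words in a page."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks)
    return Counter(ARABIC_WORD.findall(text))


def cosine_similarity(a, b):
    """Cosine similarity between two keyword-frequency vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Vectors of this kind could then be passed to a classifier or a clustering routine; a real system would likely replace the simple word pattern with a proper Arabic segmentation step, as the abstract notes.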

Brief Biography of the Speaker:
Zakaria Suliman Zubi was born in Benghazi, Libya, in 1969. He received his Ph.D. in Computer Science in 2002 from Debrecen University in Hungary. Before that, he received his M.Sc. in Computer Science (Artificial Intelligence) in 1998. He started his academic journey with a B.Sc. degree in Computer Science in 1993. He joined the Department of Computer Science, Faculty of Science, Al-Tahadi University, in 2003, where he has been an Assistant Professor since 2006. Dr. Zubi has served the university in various administrative positions, including Head of the Computer Science Department (2003-2005), postgraduate study coordinator in the Computer Science Department (to the present), and postgraduate study coordinator for the Faculty of Science for the academic year 2004-2005. He is also an undergraduate and postgraduate lecturer in the Computer Science Department.
He is a reviewer for many local scientific journals in Libya, a member of the Association for Computing Machinery (ACM), a member of the World Scientific and Engineering Academy and Society (WSEAS), a member of the Libyan Artificial Intelligent Association (LAIA), and a member of the Libyan Quality Assurance in Higher Education (LQAHE) and of the Benchmark team. He is also an external and internal member of many postgraduate examination committee boards in Libyan universities, and an official member of the main committee board for lecturer promotions at his university. His research areas include: Distributed Databases, Web Mining, Distributed Database Mining, Knowledge Discovery on Remote Databases, Remote Query Optimization, Queue Strategies on Local Networks, Operating Systems, Deadlocks in Operating Systems, and Network and Distributed Database Security. He has published, as author and coauthor, many research papers and technical reports in local and international journals and conference proceedings. His hobbies are playing chess, swimming, and listening to music.


 
Copyright © www.wseas.org                        Designed by WSEAS