Conserving Linguistic Heritage the FOSS Way
Hello! I am Omshivaprakash
I’m a Bengaluru-based Wikimedian and a FOSS contributor.
I’m here to share my experience of helping to reuse and conserve the linguistic heritage of Kannada the FOSS way!
2013-14: Vachana Sanchaya
11th- and 12th-century literature & the need of the hour...
“We need to be able to do research on Vachana Sahitya. We should be able to search Vachanas on the net. We need data to understand Sahitya much better.”
- Sri OL Nagabhushana Swamy
- Sri Vasudendra
Challenges
▣ ANSI data available on the GoK website
▣ GoK website not being intuitive
▣ 15 large volumes of printed books + others
▣ No real tool to analyze the data at one's fingertips
▣ Hot discussions on public forums needed concordance & numerical data to debate on literature
Researchers wanted authentic data so that they could come to a consensus through research... but how?
Digitize in Unicode
The idea was to get our hands on the digitized data in a reusable format & in Unicode.
Scrape
We found that the data was available in digital form on the GoK website, http://vachanasahitya.gov.in, but in ANSI format.
We pulled the data with wget and wrote a Python script to systematically extract the data and convert the text to Unicode.
ALL IN FLAT FILES
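A minimal sketch of what such an extract-and-convert script might look like, assuming the pages pulled with wget are plain HTML and that "ANSI" here means text stored in a legacy font encoding. The names TextExtractor, extract_text, GLYPH_MAP and legacy_to_unicode are hypothetical, and the two mapping entries are illustrative placeholders, not a verified table for any particular font. This is not the script we actually ran.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the visible text of an HTML page, ignoring the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Keep only non-empty text runs.
        if data.strip():
            self.chunks.append(data.strip())


def extract_text(html):
    """Return the text content of one downloaded page, one run per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


# Hypothetical glyph table: legacy font sequence -> Unicode Kannada.
# Placeholder entries only; a real table needs many more rules.
GLYPH_MAP = {
    "PÀ": "ಕ",
    "£À": "ನ",
}


def legacy_to_unicode(text, glyph_map=GLYPH_MAP):
    """Convert legacy-encoded text using greedy longest-match replacement."""
    keys = sorted(glyph_map, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(glyph_map[key])
                i += len(key)
                break
        else:
            # No rule matched: copy the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)
```

Each downloaded file would be read from disk, passed through extract_text, then through legacy_to_unicode, and written back out as a UTF-8 flat file. Longest-match-first matters because legacy fonts often build one Unicode letter from several glyph codes.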
Getting to work on data
But... it was not really enough. How does anyone take all the text in files and do research? We proposed pushing this to a database and providing simple GUI tools to search the text and look at the results.
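To make the database idea concrete, here is a small sketch using Python's built-in sqlite3 module. The table name vachana, its columns, and the helper names build_db and search are assumptions made for illustration, not our real schema.

```python
import sqlite3


def build_db(records):
    """Load (author, body) pairs into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE vachana ("
        "id INTEGER PRIMARY KEY, author TEXT, body TEXT)"
    )
    conn.executemany(
        "INSERT INTO vachana (author, body) VALUES (?, ?)", records
    )
    conn.commit()
    return conn


def search(conn, term):
    """Return (author, body) rows whose text contains the search term."""
    pattern = "%" + term + "%"
    cur = conn.execute(
        "SELECT author, body FROM vachana WHERE body LIKE ? ORDER BY id",
        (pattern,),
    )
    return cur.fetchall()
```

With placeholder records such as ("author_a", "water flows to the river"), calling search(conn, "river") returns the matching rows. A GUI tool would sit on top of a query like this, so that researchers never have to type SQL themselves.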
More challenges...
Technical difficulties
Providing the end results to a large number of people.
Helping them understand how to use tools such as MySQL Workbench, SQLite Manager, etc.
Awareness
Text input methods
SQL syntax
OS compatibility
Expanding scope
What about other research requirements?
How many queries can we write and keep sharing with linguists who are not computer-savvy?
An opportunity to build something
For a language that is close to our hearts, with a few like-minded people around, over a cup of coffee, during weekends, whenever we had some time to scribble through the needs of our people…
IT WAS FUN...
We built Vachana Sanchaya
http://vachana.sanchaya.net
Portal for linguistic research
Visualization, Discussion board, Concordance & more...
Enable everyone
Students, researchers, the common man
To unearth the wealth of literature
▣ by reading and searching through 21 thousand Vachanas
▣ written by 250 Vachanakaaras
▣ Researching at their fingertips via concordance & quick visualizations
▣ Building a corpus of 2 lakh+ (200,000+) unique words
▣ Building biodata of all male & female vachanakaaras
▣ Enabling a crowd-sourced review solution
▣ Opening up new possibilities for linguistic research across other literary works of Kannada.
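The concordance and unique-word corpus mentioned above can be illustrated with a short sketch. It assumes simple whitespace tokenization, which is a simplification for Kannada text, and the function names tokenize, concordance and word_frequencies are hypothetical. It does not show how the portal itself is implemented.

```python
from collections import Counter


def tokenize(text):
    """Split on whitespace and trim surrounding punctuation."""
    words = []
    for raw in text.split():
        word = raw.strip(".,;:!?\"'()")
        if word:
            words.append(word)
    return words


def concordance(texts, keyword, width=2):
    """List each occurrence of keyword with `width` words of context."""
    hits = []
    for text in texts:
        words = tokenize(text)
        for i, word in enumerate(words):
            if word == keyword:
                left = " ".join(words[max(0, i - width):i])
                right = " ".join(words[i + 1:i + 1 + width])
                hits.append((left, word, right))
    return hits


def word_frequencies(texts):
    """Count every word across all texts; the keys are the unique words."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts
```

Here len(word_frequencies(texts)) gives the number of unique words, and concordance(texts, keyword) gives the keyword-in-context lines a researcher would scan when debating how a word is used.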
We reached masses across the world...
FOSS
All because of the FOSS tools around us and the FOSS philosophy that we believe in...
Rails, Nginx, Passenger, Memcached, MySQL, Python, GitLab, WordPress & more...
Only the server cost to keep it running
Localized & being adopted by other projects too...
It is being reviewed to be contributed to Wikisource & Wikipedia
Moving forward
Bring more literary works online
Standardize Research platform for language
Create timeline for Centuries of Heritage
How are we planning to do this?
Collaboration
Enable community collaboration to build research documents around our literary heritage.
Engage
Engage students and others to work together on our code to build robust and futuristic tools for all types of literary works (text, poems, Old Kannada, etc.).
Evolve
Evolve over time, learning from mistakes, reviews and feedback.
Consult with communities
We would like to consult and learn from multiple language communities, because Vachana Sahitya has been translated into more than 15 languages.
Keep tweaking
We keep tweaking the tool to make it robust enough to be used as a platform for our upcoming projects.
Reaching goals
We are determined to reach our goal of building a unified search tool with a timeline for centuries of Kannada literature, the FOSS way...
We are on Social Media - FB/Twitter/Google+
Embed us on WordPress via a plugin
We will be on Mobile Soon…
We are opening up APIs to reuse data or build tools around Kannada literature
Adding English and other translated works too....
There is a lot more to share
So, Keep in touch!!!
Our Team
Pavithra, myself, OLN, Vasudendra, Devaraj
Thanks!
Any questions?
You can find me at:
Kn/En Wiki: User:Omshivaprakash
Project Page: http://vachana.sanchaya.net
Main Project: http://kannada.sanchaya.net
@omshivaprakash | @vachanasanchaya
Credits
Special thanks to all the people who made and released these awesome resources for free:
▣ Team photo by Amit Mrugvadhe
▣ To my team for having made this possible
▣ Minicons by Webalys
▣ Presentation template by SlidesCarnival
▣ Photographs by Unsplash