using nosql databases to store radius and syslog data
DESCRIPTION
A seminar presentation done for TUT's NoSQL course. A brief look into the possibility and the feasibility of using NoSQL databases to store RADIUS accounting and Syslog data. In this particular case, Syslog-NG, Radiator RADIUS server and MongoDB were used as trial platforms. The presentation includes configuration examples and also some code.TRANSCRIPT
![Page 1: Using NoSQL databases to store RADIUS and Syslog data](https://reader034.vdocuments.us/reader034/viewer/2022051210/54bd81ff4a795929228b4584/html5/thumbnails/1.jpg)
Using NoSQL databases to store RADIUS and Syslog
data, part 1: IdeaKarri Huhtanen
18.9.2012
![Page 2: Using NoSQL databases to store RADIUS and Syslog data](https://reader034.vdocuments.us/reader034/viewer/2022051210/54bd81ff4a795929228b4584/html5/thumbnails/2.jpg)
Some background
• currently RADIUS accounting data is stored usually in SQL databases with fixed database schema
• for Syslog messages an SQL database can be used, but commercial log analyzers (like Splunk) usually use their own solutions which may or may not be SQL databases
• Started thinking if NoSQL database could be applied to both or one of these?
![Page 3: Using NoSQL databases to store RADIUS and Syslog data](https://reader034.vdocuments.us/reader034/viewer/2022051210/54bd81ff4a795929228b4584/html5/thumbnails/3.jpg)
RADIUS accounting messageWed Aug 8 13:49:33 2012 User-‐Name = "jotain@realm" NAS-‐Port = 8 NAS-‐IP-‐Address = 192.168.229.131 Framed-‐IP-‐Address = 192.168.163.226 NAS-‐Identifier = "Cisco_66:77:88" Airespace-‐WLAN-‐Id = 4 Acct-‐Session-‐Id = "50223ea9/00:11:22:33:44:55/2292" Acct-‐Authentic = Remote Tunnel-‐Type = 0:VLAN Tunnel-‐Medium-‐Type = 0:802 Tunnel-‐Private-‐Group-‐ID = 0:222 Event-‐Timestamp = 1344422780 Acct-‐Status-‐Type = Alive Acct-‐Input-‐Octets = 1262012 Acct-‐Input-‐Gigawords = 0 Acct-‐Output-‐Octets = 13518133 Acct-‐Output-‐Gigawords = 0 Acct-‐Input-‐Packets = 11692 Acct-‐Output-‐Packets = 11154 Acct-‐Session-‐Time = 1235 Acct-‐Delay-‐Time = 19 Calling-‐Station-‐Id = "00:11:22:33:44:55" Called-‐Station-‐Id = "f4:7f:35:5e:bf:b0" cisco-‐avpair = "nas-‐update=true" Digest-‐Response = "P"C<188>" Digest-‐Response = "P"C<194>" Timestamp = 1344422954
One message contains undetermined number of attributes.
Some can be interpreted, some stay unknown.
Because there can be a changing number of changing type of attributes I began to wonder if NoSQL could be used for storing these?
interpreted attributes,the unknown attributes are usually left in OID:FieldDataType binary format
![Page 4: Using NoSQL databases to store RADIUS and Syslog data](https://reader034.vdocuments.us/reader034/viewer/2022051210/54bd81ff4a795929228b4584/html5/thumbnails/4.jpg)
Syslog messageThe syslog message has the following ABNF [RFC5234] definition:
SYSLOG-‐MSG = HEADER SP STRUCTURED-‐DATA [SP MSG]
HEADER = PRI VERSION SP TIMESTAMP SP HOSTNAME SP APP-‐NAME SP PROCID SP MSGID PRI = "<" PRIVAL ">" PRIVAL = 1*3DIGIT ; range 0 .. 191 VERSION = NONZERO-‐DIGIT 0*2DIGIT HOSTNAME = NILVALUE / 1*255PRINTUSASCII
APP-‐NAME = NILVALUE / 1*48PRINTUSASCII PROCID = NILVALUE / 1*128PRINTUSASCII MSGID = NILVALUE / 1*32PRINTUSASCII
TIMESTAMP = NILVALUE / FULL-‐DATE "T" FULL-‐TIME FULL-‐DATE = DATE-‐FULLYEAR "-‐" DATE-‐MONTH "-‐" DATE-‐MDAY DATE-‐FULLYEAR = 4DIGIT DATE-‐MONTH = 2DIGIT ; 01-‐12 DATE-‐MDAY = 2DIGIT ; 01-‐28, 01-‐29, 01-‐30, 01-‐31 based on ; month/year FULL-‐TIME = PARTIAL-‐TIME TIME-‐OFFSET PARTIAL-‐TIME = TIME-‐HOUR ":" TIME-‐MINUTE ":" TIME-‐SECOND [TIME-‐SECFRAC] TIME-‐HOUR = 2DIGIT ; 00-‐23 TIME-‐MINUTE = 2DIGIT ; 00-‐59 TIME-‐SECOND = 2DIGIT ; 00-‐59 TIME-‐SECFRAC = "." 1*6DIGIT TIME-‐OFFSET = "Z" / TIME-‐NUMOFFSET TIME-‐NUMOFFSET = ("+" / "-‐") TIME-‐HOUR ":" TIME-‐MINUTE
STRUCTURED-‐DATA = NILVALUE / 1*SD-‐ELEMENT SD-‐ELEMENT = "[" SD-‐ID *(SP SD-‐PARAM) "]" SD-‐PARAM = PARAM-‐NAME "=" %d34 PARAM-‐VALUE %d34 SD-‐ID = SD-‐NAME PARAM-‐NAME = SD-‐NAME PARAM-‐VALUE = UTF-‐8-‐STRING ; characters '"', '\' and ; ']' MUST be escaped. SD-‐NAME = 1*32PRINTUSASCII ; except '=', SP, ']', %d34 (")
MSG = MSG-‐ANY / MSG-‐UTF8 MSG-‐ANY = *OCTET ; not starting with BOM MSG-‐UTF8 = BOM UTF-‐8-‐STRING BOM = %xEF.BB.BF
Until researching into this I thought
Syslog messages had fixed structure and
could be then handled with fixed database schema.
Then I read the RFC5424: http://
tools.ietf.org/html/rfc5424
Here we have once again parameters, although they are within one defined STRUCTURED-
DATA field.
So could NoSQL be used also for Syslog?
![Page 5: Using NoSQL databases to store RADIUS and Syslog data](https://reader034.vdocuments.us/reader034/viewer/2022051210/54bd81ff4a795929228b4584/html5/thumbnails/5.jpg)
So what happens next?• Selection of NoSQL database:
• Likely Column Family Store if no one can suggest a better one?
• Something easy to setup and use, will concentrate into getting RADIUS server and/or Syslogd transferring data to database.
• Setting up a WiFi access point and/or controller to provide real RADIUS and Syslog data
• Storing data, retrieving data, searching data, deleting data to see what works
• Writing and presenting Part II: “Implementation and Results” of these slides
![Page 6: Using NoSQL databases to store RADIUS and Syslog data](https://reader034.vdocuments.us/reader034/viewer/2022051210/54bd81ff4a795929228b4584/html5/thumbnails/6.jpg)
Results (hopefully)
• Is storing RADIUS accounting and Syslog messages into NoSQL database: a brilliant idea, brilliantly stupid idea or something else?
• How hard can it be? What does it require to do this, is it possible and how?
• Does it actually work? What can you do with data? Is there some indication of performance improvements or problems?
• Will not do complete performance measurements though, designing and setting up reliable measurement environment will probably take too much time.
![Page 7: Using NoSQL databases to store RADIUS and Syslog data](https://reader034.vdocuments.us/reader034/viewer/2022051210/54bd81ff4a795929228b4584/html5/thumbnails/7.jpg)
Using NoSQL databases to store RADIUS and Syslog data,
part 1I: The Saga Continues
Karri Huhtanen27.11.2012
![Page 8: Using NoSQL databases to store RADIUS and Syslog data](https://reader034.vdocuments.us/reader034/viewer/2022051210/54bd81ff4a795929228b4584/html5/thumbnails/8.jpg)
Happened earlier
• currently RADIUS accounting data is stored usually in SQL databases with fixed database schema
• for Syslog messages an SQL database can be used, but commercial log analyzers (like Splunk) usually use their own solutions which may or may not be SQL databases
• Started thinking if NoSQL database could be applied to both or one of these?
![Page 9: Using NoSQL databases to store RADIUS and Syslog data](https://reader034.vdocuments.us/reader034/viewer/2022051210/54bd81ff4a795929228b4584/html5/thumbnails/9.jpg)
Results (luckily)
• Is storing RADIUS accounting and Syslog messages into NoSQL database: a brilliant idea, brilliantly stupid idea or something else?
• How hard can it be? What does it require to do this, is it possible and how?
• Does it actually work? What can you do with data? Is there some indication of performance improvements or problems?
• Will not do complete performance measurements though, designing and setting up reliable measurement environment will probably take too much time.
a good idea
easy, 1 night before presentation required
Yes. Store and Process. Unknown.Some issues to be considered.
Coded one Python script.
![Page 10: Using NoSQL databases to store RADIUS and Syslog data](https://reader034.vdocuments.us/reader034/viewer/2022051210/54bd81ff4a795929228b4584/html5/thumbnails/10.jpg)
So what happened?• Selection of NoSQL database:
• Likely Column Family Store if no one can suggest a better one?
• Something easy to setup and use, will concentrate into getting RADIUS server and/or Syslogd transferring data to database.
• Setting up a WiFi access point and/or controller to provide real RADIUS and Syslog data
• Storing data, retrieving data, searching data, deleting data to see what works
• Writing and presenting Part II: “Implementation and Results” of these slides
MongoDB
Done, but not thoroughly
Done
![Page 11: Using NoSQL databases to store RADIUS and Syslog data](https://reader034.vdocuments.us/reader034/viewer/2022051210/54bd81ff4a795929228b4584/html5/thumbnails/11.jpg)
That was the executive summary. Thank you.
![Page 12: Using NoSQL databases to store RADIUS and Syslog data](https://reader034.vdocuments.us/reader034/viewer/2022051210/54bd81ff4a795929228b4584/html5/thumbnails/12.jpg)
Now some more detailed information and even some code.
![Page 13: Using NoSQL databases to store RADIUS and Syslog data](https://reader034.vdocuments.us/reader034/viewer/2022051210/54bd81ff4a795929228b4584/html5/thumbnails/13.jpg)
storing RADIUS accounting and Syslog messages into NoSQL database
• It is a good idea because:
• When we have massive amount of log or accounting data, we need massive database clusters.
• Data is mainly stored, read, analyzed and occasionally deleted. Data will not be updated or changed and is relatively simple (few tables with a lot of columns).
• NoSQL may provide better way to scale this horizontally by distribution and sharding.
• It is already being done. Several log analyzers, stores already use NoSQL databases as backends. There exists projects such as Greylog2 etc. which provide complete solutions from log storage, visualization, analysis etc.
• Logs and accounting data are actually use cases for some NoSQL databases, for example: http://docs.mongodb.org/manual/use-cases/storing-log-data/
![Page 14: Using NoSQL databases to store RADIUS and Syslog data](https://reader034.vdocuments.us/reader034/viewer/2022051210/54bd81ff4a795929228b4584/html5/thumbnails/14.jpg)
storing RADIUS accounting and Syslog messages into NoSQL database
• It is not a brilliant idea because:
• If we look what we need to do to optimize the performance it starts to look like a lot like designing and optimizing a SQL database: http://docs.mongodb.org/manual/use-cases/storing-log-data/
• You cannot forget datatypes or database design even with NoSQL databases especially when going into production.
• Prototypes may be faster and easier for developers, but creating a design and configuration which survices production use may be as hard as it has ever been. The difference is that instead of SQL database expert, you know need a NoSQL expert.
• ... but it is not a brilliantly stupid idea either, it is an idea worth considering depending of the project.
![Page 15: Using NoSQL databases to store RADIUS and Syslog data](https://reader034.vdocuments.us/reader034/viewer/2022051210/54bd81ff4a795929228b4584/html5/thumbnails/15.jpg)
How hard can it be?
• With Ubuntu Linux Server 12.04 LTS:
• sudo apt-get install python-pymongo mongodb syslog-ng syslog-ng-mod-mongodb
• for Syslog-NG, just some configuration
• for Radiator, some configuration and coding an external Python script to handle accounting messages
• But this is far from production use, it is more like proto or proof of concept implementation done in 1 work day.
![Page 16: Using NoSQL databases to store RADIUS and Syslog data](https://reader034.vdocuments.us/reader034/viewer/2022051210/54bd81ff4a795929228b4584/html5/thumbnails/16.jpg)
Demo
![Page 17: Using NoSQL databases to store RADIUS and Syslog data](https://reader034.vdocuments.us/reader034/viewer/2022051210/54bd81ff4a795929228b4584/html5/thumbnails/17.jpg)
Syslog-ng# /etc/syslog-ng/syslog-ng.conf
# mongodb log destinationdestination karrin_net_mongodb { mongodb();};
# ...
log { source(s_src); source(s_net); destination(karrin_net_mongodb);};
# that’s it
https://www.balabit.com/sites/default/files/documents/syslog-ng-ose-3.3-guides/syslog-ng-ose-v3.3-guide-admin-en.html/reference_destination_mongodb.html
![Page 18: Using NoSQL databases to store RADIUS and Syslog data](https://reader034.vdocuments.us/reader034/viewer/2022051210/54bd81ff4a795929228b4584/html5/thumbnails/18.jpg)
Radiator RADIUS server
# /etc/radiator/radiator.cfg## send all RADIUS accounting requests to external script#<Handler Request-Type = Accounting-Request> <AuthBy EXTERNAL> Command %D/acct2mongo.py </AuthBy> AcctLogFileName %L/acct-acct2mongodb-%Y-%M.log</Handler>
![Page 19: Using NoSQL databases to store RADIUS and Syslog data](https://reader034.vdocuments.us/reader034/viewer/2022051210/54bd81ff4a795929228b4584/html5/thumbnails/19.jpg)
#!/usr/bin/env pythonfrom pymongo import Connectionimport datetimeimport sys
def main():
line = str() post = dict()
# opening connection connection = Connection( 'localhost', 27017) # database 'radius' db = connection['radius'] # collection 'accounting' collection = db['accounting']
post['acct2mongotimestamp'] = datetime.datetime.utcnow()
for line in sys.stdin.readlines(): pieces = line.split(' = ', 1) if len(pieces) == 2: post[pieces[0].strip().strip('"')]=pieces[1].strip().strip('"')
collection.insert(post) connection.end_request() connection.disconnect()
# 0 Means reply with an acceptance. For Access-Requests, # an Access-Accept will be sent. For Accounting-Requests, # an Accounting-Response will be sent. return 0
if __name__ == '__main__': main()
acct2mongo.py
![Page 20: Using NoSQL databases to store RADIUS and Syslog data](https://reader034.vdocuments.us/reader034/viewer/2022051210/54bd81ff4a795929228b4584/html5/thumbnails/20.jpg)
Does it actually work? What can you do with data?
• Yes it does actually work, but once again it does not solve or be applicable to everything.
• One can store, read, search and delete data supposedly very efficiently, but anything more complicated is harder and must be implemented by developer.
• For example: MongoDB does not have a reliable decimal datatype. It is better to keep numbers as a string and convert them when processing data.
• Repeating earlier statement: “You cannot forget datatypes or database design even with NoSQL databases especially when going into production.”
![Page 21: Using NoSQL databases to store RADIUS and Syslog data](https://reader034.vdocuments.us/reader034/viewer/2022051210/54bd81ff4a795929228b4584/html5/thumbnails/21.jpg)
Performance?
• Would need to be measured and verified and with real production environment or solution.
• Would also need to be compared with well designed and optimised SQL database, maybe even one functioning as NoSQL one.
• In the implementation this was not tested as the datasets were very small compared to real datasets.
![Page 22: Using NoSQL databases to store RADIUS and Syslog data](https://reader034.vdocuments.us/reader034/viewer/2022051210/54bd81ff4a795929228b4584/html5/thumbnails/22.jpg)
Conclusions• NoSQL should be at least considered as an option
when designing and implementing large scale Syslog or Radius Accounting storages.
• For development it is flexible.
• For production use NoSQL solution still needs design, careful planning and testing to verify if the performance, reliability and security is enough. Probably as much as SQL database design.
• Key issue will probably be can the SQL database handle the data or is horizontal scaling required.