These are unedited transcripts and may contain errors.

DNS Working Group session:
CHAIR: Hello. Well, it's the second DNS session. I will remind you that this is recorded and also broadcast, so don't say things you don't want people to hear.
Before we start, there's a request from the RIPE NCC: people who are going to the membership meeting really should register in advance, because there's no time between the NCC Services Working Group and the members meeting to do the registration. Be sure you've registered if you want to be at the membership meeting.
Now we're moving on to YADIFA, as part of our product announcement refresher series.
SPEAKER: Good morning. My name is Peter Janssen. I'll do a quick update on YADIFA; most of you who were at the last RIPE meeting will have heard about it. For those who weren't: YADIFA is a new authoritative DNS name server implementation. The basic idea was that there's always room for another one, and in good faith, at the last RIPE meeting there was room for two other ones. So we'll be talking about another DNS implementation: that's YADIFA. I'm a geek for acronyms. There's a slide there with the design goals. If you're using another name server, you should be able to do a drop-in replacement.
Where are we? The nitty-gritty: the binary packages are available on our website for the platforms you can see there, the most obvious choices for the moment, I would say: Linux and FreeBSD. We're adding the missing algorithms we don't support yet, and some tools to control the daemon, to make it start and stop and fancy things. And the most important part, which will be the rest of my presentation, is the dynamic zone functionality we're building in. We might go into a caching resolver, and a database back-end will most definitely come in addition to the zone file back-end that it currently supports. And the hot announcement: yes, it will be BSD-licensed open source, but only in June 2012, and there are specific reasons for that.
In the last presentation, at RIPE 63, I gave a few slides on performance. Here's another way of looking at performance: how long does it take to load a large zone of 198 million lines? BIND and NSD load it in around 40 minutes; YADIFA 0.8 was able to do it in eight minutes 26 seconds. Then 100,000 zones, for when you have a lot of zones: what we did is load 100,000 zones with seven resource records each. Still, we've spent a lot of time trying to make it as performant as possible, even in the start-up phase.
As I said, the rest of the presentation is about dynamic provisioning. The idea we had is: you have a running nameserver in production and you want to add a zone or remove a zone. You don't want to bring down the nameserver, change the configuration and bring it up again, especially if you have a lot of zones. So the idea was that you need to be able to add and remove zones on the fly, without dropping any queries, and preferably in a centrally managed system. So we looked at RFC 2136: can we extend this to carry not just updates to the contents of a zone, but the zones themselves and the configuration that goes with them? If you look at the general configuration, we have a couple of nameservers up and running, no zones, a minimal set of access control lists. You send a dynamic update message about abc.eu; it's not about the contents of the zone but about the configuration of the zone itself. So you shoot the dynamic update message to a nameserver: you are the master, these are the slaves. At that moment in time, that nameserver configures the zone dynamically, becomes a master, and sends out a notify to the slaves. Not a notify about the content, but that the zone itself was added or removed. And a nameserver generally, when it sees a notify, turns around and says: hey, I want full information on this zone. Again, not about the content but about the zone itself. It would receive an AXFR- or IXFR-style answer: who are the slaves, all the nitty-gritty details. Similarly, you could shoot a message to any other nameserver: now you are a slave of this zone, go and talk to the master to get the contents. How do we do this? We looked at the RFC; this is an extract. Essentially, the update looks as depicted there. The header part is not interesting, except that the opcode will be five (UPDATE); nothing else useful to say there.
For the zone section, what we would put in there: we're talking about abc.eu, and we defined a new class to talk about the configuration of zones; for obvious reasons we use hexadecimal 2A to depict that. Prerequisites: when you're adding a zone, the prerequisite is that the zone can't exist; when you're removing it, the prerequisite is that it should exist. The most interesting part is the update section. This is an extract from the RFC. There, again, we defined extra types; on the next slide I'll go into further detail about these. The class is hexadecimal 2A, and what the RDATA would be is on the next slides. We have a new type that we defined, ZoneType: when you're sending a message, "you should know something authoritatively about this zone, and you are a master or a slave". ZoneFile, which will contain the full path name where the zone should be saved once the nameserver gets the contents. ZoneNotify: the IP addresses, and the TSIG keys that should be used, for sending notifies between masters and slaves. If you're configuring a slave, it needs to know who its master is, so you add an update section where you say the master is this IP address, potentially using this TSIG key. I wanted to do a live demonstration, but I only have ten minutes and I'm approaching the end, so we opted not to do it that way. If you want to see this working, come and talk to us; we have it up and running and you can play with it. So let's see, what would the chain of events be? You send a new configuration; the nameserver doesn't do anything with it yet. Then you send out a query-like message (we call it differently, but it's the same thing) over class 2A: merge. The dynamic configuration you just received, merge it now with your actual configuration, and from that moment on the nameserver would actually start responding to queries for that zone. Obviously there's no content in there yet; that you do with normal updates, sending and receiving messages. We envisage having other things: status updates, "where are you", things like that.
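As a rough illustration of the mechanism described above, a standard RFC 2136 UPDATE message (opcode 5) reused to carry zone configuration rather than zone content, the sketch below builds such a message in wire format with plain Python. The 0x2A class is taken from the talk; the ZoneType type code and the RDATA payload are hypothetical placeholders, since the real values were only shown on the slides.

```python
import struct

def encode_name(name):
    """Encode a dotted name in DNS wire format (length-prefixed labels)."""
    out = b""
    for label in name.rstrip(".").split("."):
        out += bytes([len(label)]) + label.encode("ascii")
    return out + b"\x00"

OPCODE_UPDATE = 5        # RFC 2136 UPDATE opcode, as mentioned in the talk
CLASS_CONFIG  = 0x2A     # the experimental "zone configuration" class from the talk
TYPE_ZONETYPE = 0xFF01   # hypothetical private-use type code standing in for ZoneType

def build_config_update(zone, rdata, msg_id=0x1234):
    """Build a DNS UPDATE whose update section carries configuration RDATA."""
    header = struct.pack(">HHHHHH",
                         msg_id,
                         OPCODE_UPDATE << 11,  # QR=0, opcode=5, no other flags
                         1,   # ZOCOUNT: one zone entry
                         0,   # PRCOUNT: no prerequisites in this sketch
                         1,   # UPCOUNT: one update RR
                         0)   # ADCOUNT
    # Zone section: <zone> SOA, using the experimental class
    zone_section = encode_name(zone) + struct.pack(">HH", 6, CLASS_CONFIG)
    # Update section: one RR carrying the configuration payload (TTL 0)
    rr = (encode_name(zone)
          + struct.pack(">HHIH", TYPE_ZONETYPE, CLASS_CONFIG, 0, len(rdata))
          + rdata)
    return header + zone_section + rr

msg = build_config_update("abc.eu", b"master")
```

Sending the resulting bytes over TCP to port 53 of the target server would then be an ordinary DNS transaction; the new class is what tells a cooperating server to treat it as provisioning rather than a content update.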
Everything runs over the DNS protocol, which makes things simpler for operations: ACLs, etc. One slide to say it all: that is the website, and the mailing lists. We're running it in production. It will be open source from June onwards. Please talk to us; we're happy to get your opinion on this. Thank you.
CHAIR: Any questions?
AUDIENCE: You know we have a patent on this idea, right?
Peter Janssen: You have a patent?
SPEAKER: This is something we implemented some time ago, and as soon as people saw that it was useful, people wanted to do it in different ways. So there was this effort, born around the usual suspects, called the nameserver control protocol. There was a requirements document, and then there was a follow-up Internet draft defining first steps. Would you care to join, please?
SPEAKER: Absolutely. Again, this came as a request from operational people: when they're running BIND or NSD or any of the usual suspects and they configure a zone, there's the whole hassle of getting it onto the master and the slaves, stopping them and starting them. This is the idea we had when we were looking at this. We have implemented something, it's very alpha, but yes, the dream, if you would call it that, is that this becomes a standard for all nameservers, to configure them in general. So absolutely, yes.
AUDIENCE: Lars Liman from Netnod. When you say this becomes a standard, do you mean your idea becomes a standard, or that there will be a standard for doing this?
SPEAKER: Do you want an official response or an off-line response?
SPEAKER: We would love a "yes, this is a good idea, go there", and we have had some responses from people who think this is a good idea. The general idea is that this is a community effort, and whatever comes out on top we would implement, being firm believers in that as well.
AUDIENCE: A totally separate question. Is this IPv6 compliant, both in transport and content?
SPEAKER: And also DNSSEC?
SPEAKER: Yes. I went over the first slide quickly: DNSSEC, NSEC3, all of that is part of the deal.
AUDIENCE: Robert, Packet Clearing House. I think it really looks very promising. I just wonder a little bit about what happens when you lose one of these transport packets.
SPEAKER: For the moment it's TCP, not UDP, so you have to set up a TCP connection, with the obvious advantages of retransmits and all that. The nameserver builds up a configuration but doesn't make it live until you give it the magic word, which is merge; before that, you can check that the configuration was received in its entirety before you kick it in and make it live.
AUDIENCE: If you lose the connection to your node in Ukraine or something, you have to manually pick up on that.
SPEAKER: Yes and no. Only on the master, which normally sits pretty close to home. The slaves would use AXFR and IXFR to get new things in, which would be a full, complete configuration: either they would bring it live or they wouldn't, and if they wouldn't, they would receive another notify and come back for it, or retry some time later.
AUDIENCE: All these configuration things you can secure with TSIG?
SPEAKER: Yes, and for talking to the masters you could use that too. Again, we're not there yet. This is alpha, and I wouldn't even call it alpha; it's working for a limited set of things. And again, referring to what was said earlier: yes, we want people's opinions before we go any further.
AUDIENCE: And finally, you transmit the TSIG key to use for the transfers in clear text over the wire, right?
SPEAKER: The idea, and again this is not clearly defined yet, is that the updates of the zone, once it's configured, would be sent over to the slaves, but the initial securing of the master-slave configuration would be different. There is a minimal set of pre-configuration that you need to do on all nameservers so they can talk to each other securely.
CHAIR: I'm afraid I have to stop the questions now. All these details are a community effort, as others have said.
Okay. Thank you.
CHAIR: Next, a registry with its own authoritative nameserver will give us an update about that.
SPEAKER: We had two kids on the block, and now we have four. So this is a Knot DNS update. If you were not in Vienna, here is a short introduction, but I recommend reading the slides if you're interested in more detail. It's an open source, authoritative-only DNS server for TLDs, but we want to target everybody else as well. It's portable, it has run-time configuration, and it supports the standards. Here's the update on what's new since Vienna: we have moved to version 1.0.3 now, which we released just yesterday. We now have TSIG and access lists. The scripts right now automatically compile the zone if it's new. There are security improvements: dropping privileges and using Linux capabilities. We also improved the memory requirements a little bit.
Here's a rough road map. We would like to speed up IXFR, and we did some of that work; it turned out it takes a lot of memory, so we optimised it a little bit. We will also focus on stability and bug fixes, and I would say version 1.1 will be almost production-ready. Right now it's a testing release, and I would be very grateful if you could take the code and test it, because, as you know, developers are quite disconnected from real operations.
For the version planned in the second half of the year, we want to add dynamic updates; we already have some code, but we're still merging that. Every time I speak about Knot DNS I'm asked if there's a manual for it, so we plan to provide documentation such as a reference manual. I'm also answering the question which was asked of YADIFA: we have the DNS CCM work, and we plan to integrate that into Knot DNS. And for the last part of the year, we would like to optimise for a huge number of zones, to make it speedier when serving a large number of zones.
Here are the crystal-ball-department predictions for the next year. There are some points we would like to have, like DNSSEC re-signing; we would like to reduce the memory footprint and optimise performance; and right now you control the DNS daemon by using signals, which we want to improve on. As for other features, well, we are accepting your wishes, as in the picture. So talk to us. We would be happy to speak with you about the features we should implement, because we really want this server to be useful for the community.
Now some numbers. This is the testing framework. As I said, there are four servers which are usable, or will be usable, for TLDs. We tested the latest versions of what's available. On Linux we also tested Knot DNS in two flavours: one is the normal compilation, and for the second one we used link-time optimisation. The test zone contains two million records, a random mix of unsigned records. For the test queries, half of them hit records in the zone, and half of them trigger NXDOMAIN; roughly one million queries. A little bit of warning: this could be biased. It was done by ourselves; we don't want to be biased, but we could be. We have started moving towards a collaborative effort, and we asked DNS-OARC to be the platform on which we can talk about DNS benchmarking, to create a methodology for testing DNS servers.
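A query mix like the one described, half hitting existing names and half guaranteed NXDOMAIN, might be generated for a dnsperf-style input file along these lines. This is a sketch, not the speaker's actual tooling; the zone and names are invented, and it assumes the zone has no wildcard.

```python
import random

def make_query_mix(zone, existing, n, seed=42):
    """Generate n dnsperf-style query lines ("<name> <type>"):
    even entries hit existing owner names, odd entries ask for random
    labels under the zone, which should return NXDOMAIN."""
    rng = random.Random(seed)  # fixed seed for reproducible benchmark runs
    lines = []
    for i in range(n):
        if i % 2 == 0:
            name = rng.choice(existing)                 # hits a record
        else:
            name = f"nx-{rng.randrange(10**9)}.{zone}"  # non-existent label
        lines.append(f"{name} A")
    return lines

queries = make_query_mix("example.test",
                         ["www.example.test", "mail.example.test"], 10)
```

The resulting lines can be written to a file and fed to dnsperf with its data-file option; keeping the seed fixed makes runs comparable across servers.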
The first test is dnsperf-based. We did more iterations to stabilise the results. The independent variable is the number of threads and processes; we used the binaries as shipped, so we have no control over how many threads they used. Take that into account when you look at the picture. The dependent variable is queries per second. We tested on Linux; here is the first picture. The link-time optimisation added something like 10,000 queries per second, which is quite nice, and it could be useful for other DNS servers as well.
For the benchmark on FreeBSD, the results are quite similar to what we showed in Vienna. There seems to be some limit on FreeBSD which caps Knot and NSD.
For the second benchmark, we used the benchmark YADIFA published on their web pages. The independent variable is queries per second and the dependent variable is the percentage of lost queries. We did two runs, and the last one on the graph is with the "top speed" option, which hits the server with all it can. This is the response rate on Linux. Here the difference is not much; the optimisations are quite good. I'm quite impressed with the YADIFA results as well, because I don't think there's much difference between NSD, YADIFA or Knot in the response rate. And the query rates reached, a quarter of a million per second, are quite high. This is FreeBSD, and surprisingly FreeBSD behaves much better here. The question whether the operating system can have influence was raised two days ago at the DNS benchmarking discussion, and I think this is the answer: it can. This is run on the same hardware, and FreeBSD is much better at networking.
Here are the resources. We have a web page, the issue tracking is there, and the source code is already available; you can play with it and send patches if you find a bug. We have a mailing list. Questions?
AUDIENCE: Jim Reid. You mentioned earlier that the latest version of the software has got support for the root zone. Can you explain what you mean by that?
SPEAKER: I think Knot version 0.8 wasn't even able to load the root zone.
CHAIR: Any more questions? In that case, thank you.
CHAIR: Our next speaker, for another update, is Jacob Schlyter on OpenDNSSEC.
JACOB SCHLYTER: So, this is a short status update from the OpenDNSSEC project. This presentation is quite similar to the one we had one year ago, so you've probably seen most of it before. OpenDNSSEC is a turnkey solution for DNSSEC; it's a signer only. We still don't have any alternatives or competitors, or whatever you would call them. It's under a BSD licence, and you can have it now. Key features: we have a policy-driven configuration, where you enter parameters up front and after that the system runs from that configuration and takes care of the things that change. This is a feature that I believe could get more attention, and it's a big difference between how OpenDNSSEC is run compared to other software. We still have support for PKCS#11, and it works with the HSMs we have tried. We support key sharing between zones, and we scale to about 50,000 zones; I wouldn't advise you to run more in a single instance, it will hurt you in some way.
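For a flavour of what policy-driven configuration means in practice, here is a minimal sketch of an OpenDNSSEC kasp.xml policy. Element names and durations here are a rough sketch from memory and may differ between versions; the sample kasp.xml shipped with the software is authoritative.

```xml
<KASP>
  <Policy name="default">
    <Description>Sketch: periodic re-signing, keys in an HSM</Description>
    <Signatures>
      <Resign>PT2H</Resign>    <!-- how often the signer runs -->
      <Refresh>P3D</Refresh>   <!-- regenerate signatures this long before expiry -->
      <Jitter>PT12H</Jitter>   <!-- spread expiry times to avoid re-sign spikes -->
    </Signatures>
    <Keys>
      <TTL>PT3600S</TTL>
      <KSK>
        <Algorithm length="2048">8</Algorithm>  <!-- RSASHA256 -->
        <Lifetime>P1Y</Lifetime>
        <Repository>SoftHSM</Repository>
      </KSK>
      <ZSK>
        <Algorithm length="1024">8</Algorithm>
        <Lifetime>P90D</Lifetime>
        <Repository>SoftHSM</Repository>
      </ZSK>
    </Keys>
  </Policy>
</KASP>
```

The point of the design is that operators state intent (lifetimes, algorithms, re-sign intervals) once, and the enforcer and signer derive all rollovers and signatures from it.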
Just to mention our contributors: in the last six months we were joined by .SE. We now have a non-profit company in place, a company with limited liability that is non-profit, in Swedish terms. The company is there to give long-term support for OpenDNSSEC; it will offer support contracts, training classes, consulting services and what have you, and it will also secure funding for future development. But still, it's non-profit. We do have an architecture board; if you have any strategic issues with OpenDNSSEC, please feel free to contact any of those people. We usually meet at the RIPE meetings; we met yesterday. You probably know most of the people here.
Since the last RIPE meeting, we have done major updates to the documentation and also to the issue tracking. The documentation used to be adequate; it's now, I think, quite good. We have documentation for both older and newer versions. We've also set up an extensive QA environment where we test all versions on all platforms, which is really good at finding problems directly during development and when people upgrade packages. If you want to run OpenDNSSEC today, I recommend you run 1.3, which is the latest stable, production-quality version, with a multi-threaded signer. I know some TLDs are running earlier versions; there is an upgrade path, and if you have any questions, talk to us. There are good examples of people who have upgraded, and there is documentation on how to do that, so don't feel too afraid to upgrade. I expect 1.3 to be stable for quite some time; if you're going to do a new deployment, this is the one you want to run. Around midsummer we're going to release 1.4. It might look like a small change, but we've added integrated support for AXFR and IXFR. This means the zones will be in memory at all times, which will change the memory footprint of your OpenDNSSEC installation: in 1.3 the signer batched, signed and dumped the file to disk, and this moves it to memory. Keep that in mind. It also means you don't have to run BIND or NSD just to serve the zone files out of your signer, and we can get rid of the extra nameserver for the incoming AXFR as well. We've also dropped the auditor. We still think it was a good idea; it has played its part, and we found a lot of problems in both the auditor and the signer in the last two years. But some people have complained about the dependencies it brought in, Ruby and various libraries, so we decided it's time to drop the auditor. The source is available if you want to use it; we don't provide any support for it unless we get paid.
But basically, if you want to do auditing, use something like DNSSEXY instead. There are hooks in OpenDNSSEC to audit the signed zone file if you want to do so, but the actual auditor is removed from 1.4 onwards. I'm not sure anyone will miss it.
We also have 2.0, the holy grail, hopefully to be released at the end of the year. This is where we refactor the enforcer: the policy engine that configures and creates keys and decides on signature lifetimes and things like that. We'll support algorithm rollover. We'll have support for unsigned zones, which may sound funky, but we have large DNS providers who want one path for both signed and unsigned zones: they want us to be able to take an unsigned zone, do nothing, and deliver it on the other side untouched. Strange? Yes. It makes sense sometimes, even if I'm not sure why you would want to keep something unsigned. We'll also be able to transition between NSEC and NSEC3 and back, something that might be used by some TLDs that currently use NSEC.
Beyond that, we have a road map but no firm commitments. We'll have database input and output; people seem to like that. Dynamic update support. We'll have an improved CLI (the current one kind of sucks), and we need a good API for integrating into production environments. I'm not sure if that will be the nameserver control protocol, but something along those lines.
We want feature requests. If you have any, I'm here and Patrik Wallstrom is here; stand up and wave. So please tell us what you want, when you need it, and why, and give us some motivation to do it.
We also do education. We have several scheduled training sessions in Stockholm, and we're doing training in various locations around the world. They're reasonably priced; in Stockholm they're free. You have to pay for your airline ticket, but then they're free, even a free lunch. And we provide on-site training; we've done that in various places. Talk to us if you want training on site or at some other location. The course material is available under a free licence, so you can use it; we'll publish the PowerPoint source soon. If you want to do internal training sessions for your own staff, just use the stuff.
That was it. Do we have time for questions?
CHAIR: Time for questions.
JACOB SCHLYTER: Yes, any questions? No questions means it's perfect.
CHAIR: Perfect. Thank you. Well done.
CHAIR: And now it's Robert. He's going to tell us about Atlas and DNS and how they come together.
ROBERT KISTELEKI: I'm Robert from the RIPE NCC again, and I'd like to give you information about what you can do with RIPE Atlas in terms of DNS infrastructure and what kind of data we provide you. RIPE Atlas runs on tiny hardware probes you can install in your home, business network, wherever you want to. When you plug these guys in, they start doing measurements right away, and some of those measurements relate to the root nameservers: we do a couple of queries to all the root nameservers to get basic information about their reachability. Based on that, we provide you some useful information and summaries. This map was presented at the previous RIPE meeting already, but we have enhanced it a bit; what I would like to show you is what you can get out of it.
This is a map of K-root in particular, but you can select any root server if you go to the Atlas website and go to the maps. What it shows is the probes on the map; the hosts of those probes tell us where each probe is physically installed, and when the probes make those queries, we get the responses, colour these pins, and tell you what they see. If you scroll down you can see a key: for example, for K-root, which instances were seen by RIPE Atlas in Europe, or in North America, or Asia, and so on. You also get some details if you click on one of these guys: you'll see in which AS it is and what the response was. In this case the question is hostname.bind, so you get that, as well as some reply times. So far so good; this was presented half a year ago. But we have added a couple of new features; in particular, we added a couple more queries that we visualise as well.
So now, for example, you can look up what version of the server software these probes see. This is marked as experimental at the moment because we're just presenting it, but it's working pretty stably. You can see the colours change, but that's because the key has changed. We see a couple of NSDs and a couple of other interesting responses. There are two probes, both in Vodafone, which give weird results, and I'm told the reason is a weird version of OpenWRT that gives you something other than you would expect. If you're a root server operator you can look this up. We also do UDP- and TCP-based queries to get the serial numbers. That seemed like a simple idea, and it's very powerful already. Over UDP, I wanted to get the set of serial numbers for the root zone on K-root that these probes see. As you can see, almost everyone sees the same thing, 20120418, which is today, but there are outliers, and lo and behold, they match up with particular probes. There's an outlier in China: that particular probe in China, for some reason, is always one day behind in terms of serials. Now, you can come up with conspiracy theories: how long does it take to break the signature on the root zone? We don't know; it's just a fact. I think the point is that we're presenting the facts as we see them flowing out of the RIPE Atlas network, with the intention that it's useful for you to determine that everything is okay with your nameservers. I can do a similar thing over TCP: same answers. The two misbehaving ones are gone now, either because they're giving the same answer or because they don't give an answer at all. If you look closely you see some shadows here: there's a probe there that doesn't get an answer. That's one of the interesting maps. You can get this for all the root servers, on v4 and on v6 where available, depending on the root server operator and the settings. On v6 we have 500-something answers, and on v4 it's close to 1,500.
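Queries like hostname.bind and version.bind are ordinary TXT lookups in the CHAOS class. As a sketch of what such a query looks like on the wire, here it is built with only the Python standard library; the message ID is arbitrary, and to actually send it you would write the bytes to a UDP (or TCP) socket on port 53 of the target server.

```python
import struct

def chaos_txt_query(qname, msg_id=0x4242):
    """Build a DNS query for <qname> TXT in the CHAOS class,
    e.g. hostname.bind or version.bind against a nameserver."""
    # Header: ID, flags=0 (plain query, no recursion needed for an
    # authoritative server), 1 question, no other sections.
    header = struct.pack(">HHHHHH", msg_id, 0x0000, 1, 0, 0, 0)
    question = b""
    for label in qname.rstrip(".").split("."):
        question += bytes([len(label)]) + label.encode("ascii")
    question += b"\x00" + struct.pack(">HH", 16, 3)  # QTYPE=TXT, QCLASS=CH
    return header + question

q = chaos_txt_query("hostname.bind")
```

The same framing with QTYPE=SOA and QCLASS=IN is what a serial-number check uses; comparing the answer over UDP and over TCP is exactly the consistency test described above.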
Another interesting one, which was presented before as well, is the comparison map. On this one we again put all the Atlas probe results up there, but for each probe we do a comparison between the root servers in terms of response times, and we put the label of the fastest root server on the pin. If you click on them, you can also get all the details, over v4 and over v6. There are some probes where v6 is generally, and consistently, faster than v4, which is a good thing. If you're interested, you can look this up and see how your root nameserver, if you're operating one, relates to the others. And this, as well, works on v4 and v6.
All right, so again, this is just for reference if you're looking at the slides: this is what I showed, and these are the queries we can visualise on the map. Why is this useful for root DNS operators? You can compare your response times to other operators and deduce points of presence for the future. You can also see which nameserver versions are visible and which serials we observe inside Atlas. You can check whether UDP versus TCP is successful; we have seen a request on the mailing list that it would be nice to have a map that directly compares this and makes very visible the locations where UDP succeeds but TCP fails. Those are the ones where you want to contact the network operator, or do some research to determine why this is. This is something we want to do.
We have heard it would probably be useful to do checks from the probes, like what's the maximum packet size that can go back and forth. If you feel this is a good idea, please tell us and provide support (yes, there is one supporter, thank you), preferably written support on the mailing list, so we can embark on this and do it.
Obviously, the root server operators are one part of the crowd, but you guys are most likely operating DNS servers, or living close to DNS servers, as well. What we want to do is make Atlas work for you too. In that sense, what we have done so far is release the so-called UDMs, user-defined measurements, to all Atlas users: if you're a user, you can schedule your own measurements. At the moment those are constrained to ping and traceroute, so in some instances that gives you information about how the world can reach your nameserver, even if it's not a root nameserver, but it would be much more useful if you could do DNS queries. And the message here is: that is coming. In the near future we're going to release DNS queries in the UDM framework. You will be able to do exactly the same queries we're showing you for the root servers against your own servers. This is most useful if you're running an anycast nameserver, but it could also be useful if you're running unicast. Once we release this, you will be able to do this for yourself, and we will provide you similar maps as before.
The RIPE NCC membership consists of more than just DNS operators, so we would like to make Atlas, and everything coming out of it, useful for everyone else too. We have been thinking about the potential benefits for a general ISP that is not a DNS operator in particular, and having had a couple of chats with you guys, one conclusion was that it would be very useful if I could look at this map, see where I am as an operator, and compare myself with my friends around me, or in the same country, or whoever I want to compare myself to. So we would like to do that. In practice, that means you'll be able to select your own AS when looking at these maps, or the probes that are close by, and look at where they end up when they do DNS queries. If you see funny things, like you're in the UK and your users end up in Japan when they query a nameserver, there may be something you want to do about it, but at least you'll have the data to prove what is happening.
That's all I wanted to say at the moment.
CHAIR: Thank you. Any questions? I noted that China was actually on time on IPv6.
ROBERT KISTELEKI: It's entirely possible.
AUDIENCE: I think you said it doesn't currently support DNS queries over TCP, but that's on your wish list?
ROBERT KISTELEKI: That's not exactly true. I may have said that, but it's wrong: we do the serial queries over both.
AUDIENCE: Roland van Rijswijk, SURFnet. I was just wondering how you are going to make sure that everybody gets a slot if they all want to do their queries on the probes. If loads of people show up and want to do queries, you may end up running out of time on the probe to do the actual queries. How are you going to manage that?
ROBERT KISTELEKI: In terms of what and how we schedule on the probes for the exact measurements: we have a scheduling algorithm that says these probes should do this type of measurement and those probes that type. I don't expect all the probes to participate in all the measurements, but it's true that once we reach a certain stage, you could use hundreds, maybe more, probes to do this. So the answer is twofold. We have to make sure that the probes are not overloaded in terms of how many measurements they do, and that the network of the host is not overloaded as a consequence. We also want to make sure the destinations are not overloaded: we don't want RIPE Atlas to become the ultimate botnet for DoSsing DNS servers; that would be bad. So we are going to deploy rate limits towards any particular destination. I hope that answers your question.
AUDIENCE: You might do a credit system?
ROBERT KISTELEKI: In terms of who can do how much: we have a credit system. You have a bunch of credits that you can spend on measurements, and depending on how often and how much you want to measure, you spend more or less, so no single user of the Atlas system can overload it or claim all the capacity, let's put it that way.
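The per-destination rate limiting mentioned above is commonly implemented as a token bucket, and a credit system is similar accounting done per user. Here is a toy sketch of the idea (not RIPE Atlas's actual implementation): each destination gets a bucket that refills continuously, and a measurement is scheduled only if a token is available.

```python
import time

class TokenBucket:
    """Rate limiter: the bucket refills at `rate` tokens per second,
    up to `capacity`; each allowed event consumes one token."""

    def __init__(self, rate, capacity, now=None):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity                 # start full
        self.last = time.monotonic() if now is None else now

    def allow(self, now=None):
        """Return True and consume a token if the event may proceed."""
        now = time.monotonic() if now is None else now
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A scheduler would keep one bucket per destination (and one per user for credits) and simply skip or defer any measurement whose bucket says no.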
AUDIENCE: Okay. Thank you.
AUDIENCE: Philip, with a question from a remote participant: they want to know how DNSMON relates to Atlas, especially when it comes to user-defined measurements.
ROBERT KISTELEKI: That's a very good question. At the moment DNSMON is up and running. I encourage everyone to come to the NCC Services Working Group, where Daniel is going to give an update on how we imagine the future of DNSMON. Atlas is almost able to do everything DNSMON can do at the moment, and it's likely RIPE Atlas will be used as a successor to DNSMON. For the details, please come to the NCC Services Working Group session.
AUDIENCE: Peter Koch, DENIC. You mentioned rate limiting. Have you thought about an opt-out list for user-defined measurements?
ROBERT KISTELEKI: Yes, not particularly for DNS measurements, but this is something that, if we see there is a need for it, is entirely possible to do. I think it's going to be the ultimate discussion of opt-in versus opt-out: do we by default allow people, or by default not and then let them opt in? So the short answer is that at the moment there's no way you can opt out, and the reasoning for that is we believe that DNS measurements are not sensitive in this sense. For other kinds of measurements that we may or may not do, we will probably add these kinds of features.
PETER KOCH: I would appreciate sharing it, because DNS-OARC has a do-not-probe list that was set up because of some of the measurements that have been done, but that is probably long since outdated, so...
ROBERT KISTELEKI: The way I understood your question, it was more like opting out from a particular type of measurement. But you're absolutely right, there is a feature which we are thinking about -- it's not in yet, but we could put it in -- that there might be networks that structurally should not be probed, and we may even want to give people the opportunity to say: these are my networks that you shouldn't probe.
CHAIR: Thanks, Robert.
CHAIR: Talking about root servers, the L-Root server is coming close to you this summer.
SPEAKER: My name is Dave Knight. I work in DNS operations at ICANN. What I'm going to talk about today is some recent changes we made in how we provision and deploy L-Root.
Okay, how does this work?
So I'll give some background just to give some context to what I'm talking about, and then speak about the expansion we've done in the last few weeks, the redesign that we had to do to make that possible, and what further work we still have to do.
For the background: once upon a time, we were limited to one authority server -- one physical server -- per entry in the NS RRset, and that had a set of practical limitations. It meant that some number of users would have a high RTT to get to the servers. Servers might be less reachable. The scaling and performance characteristics of this, given the growth of the network that happened, were not so good. Then we had Anycast. That allows us to get beyond the limitation on the number of servers by making copies and putting them at different points in the topology. Several root servers have been anycasted for a decade now. A typical model that allows you to do that has a stack of servers and network gear, usually hosted at an ISP location. Anycast let us get past the limitation on the number of servers and allows us to take the service closer to the users. It allows greater capacity to handle queries and also offers flexibility: when you have many, many servers and you can easily drop the advertisement for the service, it makes it an operational decision to take malfunctioning servers offline. And when we suffer attacks, it keeps the attack traffic closer to the source. I remember a big attack against the root in 2007; a large amount of that traffic came from Korea, and we had nodes in Korea that were able to soak up most of the traffic.
L-Root has been anycast since 2007. In our original model we had three big nodes -- ten servers in a location with a router, all IXP locations. And we operated that up until last year, when we decided to expand. We wanted to expand because those three nodes had us present in North America and one location in Europe. We wanted to increase our ability to handle queries and also wanted to take the service into under-served regions for L. The way we decided to do that was to change from having a small number of big expensive nodes to a large number of small cheap nodes. Operating lots of nodes means a significant increase in the amount of work to manage them, so we had to do a lot of work on how we manage the nodes.
As to where: there's no real good heuristic for locating nodes. The typical model has them at IXP locations. We're just one of 13 root servers, so we could take this opportunity to do something different and add diversity. We decided to forgo the IXP preference and go into eyeball networks directly: they provide a server and we operate it. We began a field trial of this last year. We deployed around 30 nodes in 15 to 20 networks; some operators bought several servers. While we were doing that trial of rolling out the equipment, we were working on the new platform that we'd use to make this scale. We began rollout of that platform at the end of February and were able to deploy 60 new locations. So, what we did in March: our model is that we deploy a single-box solution. The host buys one or more boxes -- if they want to put more than one per pop, they can. We can increase the number of locations by deploying virtual machines operated by PCH; we rolled out 146 at 40 locations. Each VM is equivalent to a physical server; it gives us the same query handling capacity. This is how L-Root looked before we started this new deployment: we had 68 servers at 37 locations. At this point two of our locations were still big nodes, and they have 16 to 20 servers each, which is why we have a bigger number of servers than locations there. During March we deployed 21 physical servers and the rest in VMs. We now have 97 servers, 146 VMs, and 94 locations. We don't intend to stop here; we want to continue deploying nodes. Really, it was all around: how do we reduce the OpEx of running nodes? Our old platform ran CentOS and we had scripts to automate the management of that, but we still had to treat each box as an individual entity that needed its own care and feeding. So our desire with what we were building was to turn this into -- I hate to use this term --
turn it into a cloud, with a single set of knobs and switches, so we could stop thinking of individual boxes as anything special. Our new servers run Ubuntu, and that decision was made because it was more flexible. An enterprise OS gave us stability, but we did a lot of hand compiling, and with the new platform things happen quicker. And we like Debian package management more. We developed a fully automated install and automated administration with Puppet, which gives us a single set of switches for the whole deployment. It allows you to describe a policy for how machines should be installed and configured, which it enforces. The nodes check in periodically, receive any policy updates, and then apply them.
For configuration management we have a single configuration file per node, which describes the things that are unique about that particular box -- which is not much. This is what the configuration file looks like: essentially it just has location information, which we use to populate the map of where these things are, the network information, and a TSIG key for that specific node.
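As a hypothetical sketch (the actual file format was not shown in the talk, so field names and values here are assumptions for illustration only), such a per-node file might look like:

```yaml
# Hypothetical per-node configuration -- not an actual L-Root node.
location:            # used to populate the map of node locations
  city: "Amsterdam"
  country: "NL"
network:             # the few things unique to this box
  ipv4: "192.0.2.53/24"
  gateway: "192.0.2.1"
tsig:                # per-node key for zone transfers
  name: "example-node-key"
  algorithm: "hmac-sha256"
  secret: "<per-node secret>"
```

Everything else about the box is common policy, which is the point: the unique per-node state stays this small.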
So we do an install. The boxes we use: currently we standardize on Dell servers, and they have a DRAC management card. Once we've created the configuration, we can run an installer that talks directly to the DRAC, tells the DRAC to mount the install image and then to boot from it. As soon as the box comes up, it's configured enough to start Puppet, which then completes the configuration. If the box is in the same network, it takes about five minutes to install, and within a few seconds of that it's fully booted and ready to go operational. We do some checks before we make it live, just to verify that the name server is working and things look sane. Turning it live is a matter of changing one configuration line in the Puppet config, and the next time the node checks in it advertises the service prefix and is online. So, I mentioned we use Puppet. One of the nice things we can do with Puppet is that each of the configuration elements can be broken down into modules. For NSD, we describe what packages are required and how it should be configured, and we can write a template in Ruby. Because this is a module, we can have a development environment and a testing environment where we test those changes before we introduce them into the production system. For monitoring, we currently use a package called InterMapper to test the availability of the service. Right now we still have to configure that manually; we haven't completed our migration. What we're going to do is move to Nagios, which allows us, in the description of a policy for every service, to quite easily say: have Nagios monitor this service, and for every service configured it starts monitoring it. We haven't done that migration yet; that's one of the next things on the to-do list. We also use a package called Observium. Currently that's manually configured too. As I'll mention in a moment, we intend to make our whole environment use this new stuff.
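An NSD module of the kind described might be sketched roughly as follows. This is a hedged illustration only: the class, package, and path names are assumptions, not taken from the actual L-Root manifests.

```puppet
# Hypothetical sketch of a Puppet module like the one described:
# it declares the required package, a config file rendered from a
# Ruby ERB template, and the running service, so Puppet can enforce
# the whole policy on every node that includes this class.
class nsd {
  package { 'nsd':
    ensure => installed,
  }

  file { '/etc/nsd/nsd.conf':
    ensure  => file,
    content => template('nsd/nsd.conf.erb'),  # template written in Ruby ERB
    require => Package['nsd'],
    notify  => Service['nsd'],                # restart NSD on config change
  }

  service { 'nsd':
    ensure => running,
    enable => true,
  }
}
```

Because the module is self-contained, the same definition can be applied in a testing environment before being promoted to production, which is the workflow the talk describes.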
For a lot of our back office things we still need to take the configuration manually from Puppet to configure them, but hopefully in the next few months all of that will just happen automatically. And we use DNSMON to monitor L. We monitor traffic with DSC, which is now starting to cause us problems: the current version of DSC is not good at handling hundreds of servers. We hope a new DSC will make this better. We have individual stats for every server instance. I don't have a lot of -- we haven't had a lot of time to do a great deal of analysis of how things have changed since we've done the mass deployment. Very quickly: we don't seem to have attracted a lot of new traffic; the traffic we have is more thinly spread among the nodes. Something we don't do right now, but will in the next few weeks, is switch on constant collection of packets on every node, so that we have something to look back at when we see interesting things happening.
So yes, I mentioned further work. We still have our old deployment of CentOS nodes. When we did our field trial we started rolling out the hardware; it would be fairly trivial to start converting those. As I mentioned, the back office services that do our monitoring are still on CentOS; we have to convert those. Our goal is to get to a point where, when we deploy a node -- once we've done the negotiation with a host and run the install script -- we shouldn't have any other manual steps to do. We're continuing to deploy nodes. If you're interested in hosting a node, you can talk to me or see the contact information on the website.
AUDIENCE: Daniel Karrenberg, RIPE NCC. Thank you very much. I like talks that talk about practical experiences, so that's very well done; I'd like to see more of that. We are operating K currently and thinking about which way to go: go this same way of making hundreds of nodes like you're doing, spreading the query load very thinly and getting some resilience benefits and some small RTT benefits out of that, or not. I'd like to have some feedback from the DNS Working Group on which way they want to see K go: like it is now, more or less, or more in the direction of L.
The question I have -- that was the question to the group; the question to you: I see that you want to minimize the operational effort. How have you addressed the potential inconsistencies? The more of these that you run, the greater the likelihood that some of them, maybe just a small number, give different answers, because they don't update their zones, or somebody attacks them, or there's just brokenness. How do you cope with the increased monitoring requirements and things like that?
DAVE KNIGHT: We have monitoring in place that checks every few minutes, so we can go in and fix things. At the moment, we don't see problems like that cropping up very often. Naturally, if we were to scale up to a thousand nodes, perhaps that might become more of a burden, but, yes, I guess we would fix the problems, and perhaps we'd stop growing at the point where that becomes too much of a workload. And for each of these boxes, we can connect directly to the machine itself; they all also have the management interface, so we have a back door to every one of them. Beyond that, maybe there's more to your question.
DANIEL KARRENBERG: The increased monitoring is manageable?
DAVE KNIGHT: So far, and we think it is.
CHAIR: If people have suggestions for the K-root server, where can we best drop them?
DANIEL KARRENBERG: I think the DNS Working Group mailing list is a good thing to have a discussion. If you feel it's too detailed for the Working Group mailing list, hit me or Robert or any NCC staff.
AUDIENCE: I was wondering, because we're running an anycast network on a much smaller scale, and what we see is that traffic doesn't end up in the location you would expect it to end up in. So my question is: how dynamic is your BGP network setup? Do you prepend locations independently and monitor that and try to optimise round trip times? Also, do you analyze the topological implications of adding a new node, or do you put it in and wait and see what happens?
SPEAKER: Our decision-making for where we host a node is: can you buy a box? Pretty much, if you can provide a machine and you can do simple things like talk BGP to us, we'll probably deploy the node without much further analysis.
And then to your next question, the analysis we do: really, we're looking at traffic. If we see a node that seems to be getting too much, or more than we expect, perhaps we would take action. Some of the anycasted root servers use a hierarchical model, global and local; we don't do that, all our nodes are global. As to the suboptimal routing: because we're going directly into eyeball networks, we haven't done the analysis, but as we spread out, that's going to be limited just by virtue of the fact that we're inside the networks. I forget if you had another question.
AUDIENCE: That was a good explanation, thank you.
AUDIENCE: A remote question: I would like to thank you for sharing the details. And then his question: are you going to do release management -- are you going to update all the servers every six months or something like that?
SPEAKER: We haven't decided that. We certainly don't run automatic updates. Periodically we can run an update on a lab machine, test that things look sensible, and apply those in production. As to making release updates: because of the automation of the installation, it's easy to fully reinstall a box in half an hour and have it come online again. That could be something we could do; we haven't really thought about how we'd handle that yet. Although we want the flexibility, we don't necessarily want to be slaves to it. If that answers that.
AUDIENCE: Hostmaster here; we host two of your nodes and I think we're going to have a third. There used to be statistics for each node and now it's changed to regional. I know why -- because you have too many nodes -- but is it possible to...
SPEAKER: I think if you dig down -- it's hard to lay this out in DSC -- but if you look, you see the things split by region, and then the same things again if you click on the second set.
AUDIENCE: I click on Europe and then I can see the list?
CHAIR: Maybe you should do this offline. I would like to thank the speaker and those who asked questions.
CHAIR: Now we change to something completely different: we're going to have a panel about DNS Changer. It will be chaired by Peter Koch, and while he's assembling the people I will try to revive people's memory about DNS Changer. Basically, DNS Changer was this affair about a botnet which was changing the DNS server settings on people's local machines, pointing them at rogue DNS servers. In dismantling this botnet, people tried new tactics, and the tactics were that you didn't actually remove the rogue nameservers; instead these were taken over by ISC so queries were answered properly. That caused some controversy in the community about whether or not this is a good idea, and about other aspects of it. RIPE also got hit by some interesting court orders. So the idea is to talk about it; over to Peter for the rest of the session.
PETER KOCH: I was asked to wear my DNS Working Group co-chair hat here; otherwise, no hats -- just a concerned and maybe inspiring individual looking at the needs in the aftermath. We're doing this in the DNS Working Group because it's called DNS Changer, but you may find out it's not really a DNS topic; it's something about security, botnet mitigation, and not least of all a governance issue. We invited Joao because he's with ISC, and they were very active with the DNS Changer Working Group, whose website is up there. Before you all go to that website, wait a second and listen to what the panelists have to say. Brian is a co-chair of the Anti-Abuse Working Group in RIPE, and Jochem is the supervising legal counsel and has been dealing with what was mentioned as the interesting police order in that case. So, picking at random: there were two issues facing us or surfacing here. One is the police order from the Dutch police, so I would first like to ask Jochem to give a short statement or update, then I would like to ask Joao to explain the international background, and then we'll start an interactive discussion and probably invite the floor as well.
Jochem: Thank you, Peter. So, how it went for us: we were contacted from the US by the FBI saying, well, we're looking at this; some blocks are registered with ARIN, some with the RIPE NCC, and we're looking at how to resolve this network. They were talking to ISC at that point in time already. We said quite clearly that we needed a Dutch court order to do anything, so they looked into that. To skip a bit through time: they came up with a Dutch order which was based on the police act, and we executed the order at that point in time. I'll tell a bit more about it in the NCC Services Working Group. What it proved to be after the order was that the legal basis of the order wasn't so solid under Dutch law. So at the moment we've challenged the state on the order, we're in legal proceedings with the state regarding this order, and we're anxiously awaiting the outcome, how a judge will look at it. In communication with the public prosecutor in the Netherlands, he threatened us with certain confiscations; whether that was realistic or not, it was a forceful measure. We want to get some clarity on that. That's where we are now.
PETER KOCH: So I guess you mentioned the date of March the 8th, right? Or was that the termination date that was given in the order?
Jochem: In the order it was the 22nd of March, but because of the proceedings and the communication we reversed the order in January already, so the date has passed from our side. But nothing has happened. I know the order has been extended in the US up to July 9th.
PETER KOCH: Maybe Joäo can expand on that, the technical aspect.
JOAO DAMAS: I'll try and make it short. Basically, how we got involved: the people who were involved in tracing these people and taking their business down realised that the infection that let them operate the way they did was quite widespread; it affected not just the PCs but DSL modems -- they found ways to change the default configuration in DSL modems. When law enforcement was going to take this down, they realised that if they just switched off the infrastructure completely, the people that were infected -- and they were counted in the millions -- would suddenly lose all Internet access and not know why, probably start calling their ISP, and the chaos was going to be too much and not productive. We were involved early on. Some of the people who were involved have been in the security world for a long time: Barry Greene, Marian, who is Estonian, and Paul Vixie. Okay: what we could do is, if we get the list of addresses that these people were using to provide the services, we can operate a name service on those addresses and provide a clean service, so that the infected can have some time to go fix their configuration without losing service, without disrupting ISPs and so on. So we got that role through a court order in the US: the judge appointed us to operate those servers, and initially gave us a period of operation until March 8th. As is quite frequent, people didn't do much in the way of remediation, so we asked for an extension, and finally we were granted an extension to July 9th -- but that's definitely going to be the last extension. What we're trying to do in the meantime is get out and help people as much as possible by making ISPs, in particular, aware, trying to give them some time to plan and to clean; there are different approaches.
To anyone who asks, we'll give them the list of addresses, if they choose to start running the service internally: which addresses these servers must answer on, which are recursive, etc., to minimize the impact. It has to be clear to everyone that this redirection -- the operation of these IP addresses -- cannot go on forever. The IP addresses don't belong to ISC, don't belong to these guys anymore, and they eventually might be reclaimed by the RIPE NCC. I don't know what the end situation is going to be. We can't pretend that we can spoof the source of those addresses forever and ever.
PETER KOCH: Thank you. Brian, you are one of the co-chairs of the Anti-Abuse Working Group, and your Working Group is concerned with exchanging information and experience about abuse and abuse mitigation, as well as trying to frame or propose policy. What is your perspective on this? Can you put this particular malware a bit in context, and what it means from your perspective or your Working Group's?
BRIAN NISBET: It hasn't been something discussed at any great length within the Working Group. It's another thing people have used to abuse the network, to misuse resources in a way they shouldn't, so from that point of view the specific nature of DNS Changer from a technical point of view wasn't of huge interest: it's network abuse and we don't like it. What made it interesting was the way the law enforcement and technical community reacted to DNS Changer, and what came out of that in regard to the freezing of registry information by the RIPE NCC. So those are the particular areas of interest that we came across. It showed both a positive side and the potentially negative or uncertain side of the community's interaction with law enforcement. I think it was fantastic that the FBI talked to ISC, that they worked with them, that there certainly seems to have been good interaction with ARIN and the RIPE NCC, and, bar the threatening words from the Dutch prosecutors, most of it seems to have been cordial. Then there's the downside: suddenly the world found the NCC was freezing registration resources, and a lot of concerns were raised about this, and a lot of people were going, oh, how does this happen, how did this happen without a court order, what's going on here? I was very relieved to hear -- something we will be touching on in the Anti-Abuse Working Group tomorrow -- that the NCC and the Dutch police will be going on a blind date to court, or whatever way we're going to describe it, to discuss these things and to try to figure out what the exact situation is going to be, because what we don't want to have happen, from a community point of view, is another exceptional circumstance where something is made up and done. And the vibe I got from the law enforcement people who discussed this -- and I am not --
I do not represent them, in the same way I don't represent any specific part of the community -- is that they were interested in the whole thing, but the quote from the FBI was that they didn't want to run DNS servers. They were happy to do this this once, but they would prefer to extract themselves from the situation as soon as possible. I think it showed great cooperation. It was positive, but it highlighted a number of areas touching on DNS, the database, abuse, and a wide range of issues.
PETER KOCH: Thank you. I'm trying to summarize in my completely unbiased way. I hear this is all great cooperation and so on and so forth; two days ago we were standing here celebrating 20 years of successful industry self-regulation, and now this involves court orders, or police orders, which is a bit odd from that perspective. This police order -- and I'm only an armchair lawyer in my part of the jurisdiction -- police orders are usually the shoot-first-ask-questions-later type, which is usually applied to remove an imminent danger. I understood the imminent danger was already gone, because the bad recursors were taken down, so this was the aftermath. How would that factor in? And the interesting part at the end might be: we've seen numbers like several hundred thousand infected systems, and of course the extension and the replacement of the recursive servers were to give people a chance to disinfect themselves and get the systems disinfected. How successful was that? And, as Brian already mentioned, can that be repeated in the future? Jochem, could you start with the policy aspect of that?
Jochem: Policy or police?
PETER KOCH: Both, actually.
Jochem: We learned a lesson here. The communication between the FBI and ARIN was good; they had detailed communication on the content of the order. We didn't have that on the Dutch side; we were faced with a yes or no. With hindsight, maybe it would have been better if we had not executed the order. I think in the future, when it's just a police order and it hasn't been stamped by a court, we wouldn't do it like that. Someone's on the phone. But I think what is key is that we get better communication in the Netherlands about some of these things. Sorry, I'm distracted.
So I think those are the sort of lessons learned by us. We were in contact with our members about this freeze. They didn't comment much about it; they asked what happened, we explained it to them, and they said okay, fair enough. So, yes, that's our perspective. We're looking forward to the court case -- well, Brian says it's cordial, but it is a formal court case where we have different opinions. We have to see how that works out over the coming period.
PETER KOCH: Thanks. Joäo, what about the success and the remaining systems? Will the sky fall on the ISPs?
JOAO DAMAS: There is some success. Looking at the number of queries that we get at the servers, it's going down, a uniform trend, but there's still quite a lot of work to do on the ISP side. Whether it will be complete by July 9th, I don't know; if I had to bet, I'd probably bet on the no side. We have had some ISPs contact us for more information, to see how they can best address this for their own customers, which I think is an encouraging sign that people are taking this seriously. But because the system just works -- we replaced the servers with working servers that were clean -- some people will go on to more urgent things and will only notice on July 9th. There's not much more we can do about it.
PETER KOCH: Brian, any other comments or questions?
BRIAN NISBET: No.
AUDIENCE: Jim Reid, speaking as a random guy. I'm concerned about some of the jurisdictional issues around this, particularly from the law enforcement action, and I don't want to put you all on the spot here, but suppose, for example, a European law enforcement agency came to you and said: do what you did for DNS Changer for some other thing. How would you react to that if it's not under the laws of the Netherlands or the laws of the UK?
JOAO DAMAS: I guess that's one bridge we'll cross when we come to it.
AUDIENCE: Nice answer.
JOAO DAMAS: It wasn't a unilateral thing. There were legal processes and different parties involved. I think it should be taken as an example of how things should be done.
BRIAN NISBET: I think -- and you know a lot more about this than I do -- one of the comments made by some of the LEA folk involved was that they weren't proposing this as something that was going to be doable or workable every time. They went: this worked, we got the right people at the right time and the right place, and we managed this. But they didn't seem to be putting it forth as a playbook for these types of circumstances.
PETER KOCH: Okay.
RANDY BUSH: Randy Bush. I have negligible understanding of Dutch law, but one of the things I've been told is there's a considerable body of process that does not require a court order -- administrative law, etc. And, in fact, I think what happened, if I'm hearing correctly, is the police came to your door without a court order and you caved. When I worry about attacks on the RPKI in the Dutch legal system, the example that seems to have been set is exceedingly threatening.
PETER KOCH: Does anybody want to comment or respond?
Jochem: I think that's a fair point in relation to the RPKI, and that's why we're now pursuing these legal proceedings. The one complexity is the international mutual legal assistance treaty process, in which the prosecutor in the Netherlands has to make sure that the court order in the US meets Dutch law. And, yes, I can't judge that, I'm not a lawyer, but that's the process they follow. That's an international assistance treaty process.
RANDY BUSH: The problem when you take this to the parallel of the RPKI is that you will instantly kill somebody's entire business when you cave. It's not like, oh, this is a process that will go on and can be handled in the courts and there will be a court process: they are dead. You killed them.
PETER KOCH: You sort that out among yourselves, like Daniel and Liman.
AUDIENCE: I want to speak to some of that. Jochem is so deep into it that he probably doesn't see the forest for the trees. But if you go back to what the board of the NCC said, it is essentially: we will not cave again. So, to answer Randy: we will not cave again. We will insist on judicial review. We will defend the rights of our members. Of course, you cannot foresee the circumstances of something in the future, when the police come with machine guns and say there's a life at stake, somebody's been kidnapped or whatever -- you can make up this type of stuff; we will cross that bridge when we get to it. But there were quite clear statements saying we will not cave again, we will insist on judicial review in the future. And I hope that alleviates your concerns. You can always make mistakes.
PETER KOCH: Okay. Thanks, Daniel.
AUDIENCE: Lars-Johan Liman from Netnod. You still have the problem of turning this stuff off. What about delaying the responses, more and more, to create gradual pain? That way it won't create a spike or a threshold of calls to the support desk, but as people react to different grades of pain, maybe it would work.
PETER KOCH: Do you want to give a short response?
JOAO DAMAS: Yes. Short response: No.
PETER KOCH: And nobody is mentioning DNSSEC as a mitigation strategy?
PETER KOCH: I was kidding actually.
So I guess we can thank the panelists for their contributions, and thanks to the microphone contributors. Jaap is now going on to any other business, and I know we're over time.
CHAIR: Yes, we're out of time. Is there any other business? Somebody has his hand up. There's one thing I promised to mention: there are demos of the routing system. If you want to see them, speak to Joao -- and make it very short.
SPEAKER: Very short: we now have T-shirts, and we will be giving them out at the back of the room. I have two AOBs. For those of you who have attended previous DNS Working Group sessions at RIPE, you might have noted that something was different on this meeting's agenda -- and no, you will not win an iPad if you tell me the difference. Any idea? We left out the reports from the IETF and other venues. If you missed them desperately, please let us know. If you didn't miss them, please tell us too, so we can skip them next time as well. And if you have any other concerns, please let us know. The other thing is we were asked to ask people to register for the AGM during lunchtime.
CHAIR: Once again, if you really want to go to the members meeting, you should register, and do it in time, not right before the meeting starts. And we are adjourned.