Author Topic: How to spec a Nettalk WebServer (Read 11765 times)

de la Rosa · « **on:** January 24, 2014, 12:23:12 AM »

Hi,

Is there a rule of thumb for specifying the minimum hardware requirements for a Nettalk WebServer? Say, for 100 clients?

TIA,
Vic

Bruce · « **Reply #1 on:** January 24, 2014, 08:19:08 AM »

Hi Vic,

100 clients is basically nothing, so a pretty low-spec machine would probably be fine. Once you get into the tens of thousands of users territory then you want to pay a little more attention. Of course it depends to some extent not on the number of users, but rather how busy those users will be.

Like with most computer things, the drive speed and network speed will likely be the main bottlenecks. If you're actually buying a server then an SSD will be nice. Any modern CPU, with say 4 cores, will fly along very nicely.

cheers
Bruce

Vic · « **Reply #2 on:** January 24, 2014, 03:19:45 PM »

Hi Bruce,

Oh, so Nettalk 7 already works like an IIS or Apache then? I remember in the earlier days when you mentioned over a hundred will be an issue already because of the spawning. Anyway, my server application will be responsible for data logging from no more than 500 sites, but will be updating like at intervals of 5-10 secs per site. However, the data payload is no more than 2k bytes. Other than that, there will be some browsing and editing of records. I was wondering whether it would be necessary to group the sites into clusters and assign a different port to each cluster? Although, as you said, the frontend might not really be the issue now; the updating of the backend DB might be. About 3/4 of the traffic will be updating and reading while the balance will be inserting records.

About your suggestion of using SSD, won't it be necessary to eventually archive or mirror to HDD?

Thanks,
Vic

Bruce · « **Reply #3 on:** January 24, 2014, 11:18:31 PM »

>> Oh, so NetTalk 7 already works like an IIS or Apache then?

I'm not sure what context you mean by "works like" - but NetTalk has always been highly scalable and works in the same way as IIS or Apache in the sense that it supports lots and lots of users using a "Session" approach rather than firing up an exe for each user.

>> I remember in the earlier days when you mentioned over a hundred will be an issue already because of the spawning.

I think maybe you misunderstood. The main constraint in the system is RAM, specifically the RAM that can be accessed by a single Win32 exe (which is about 3 gigs or so on a Win64 OS). The primary consumer of RAM is a thread, because all your global threaded variables and objects get memory allocated to them for each thread. The number of possible threads is thus _very_ dependent on your program but a rule of thumb is that 100 threads at a time is usually fine.

Note that threads do not equal users. Threads are created to handle a request; they deliver the response and then die. The average thread lives for about a tenth of a second. So, in requests per second, you're looking at about 100-200 rps easily.
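To make the threads-versus-users point concrete, here is a rough Python sketch (not NetTalk code) that applies Little's law: average threads in flight = request rate x thread lifetime. The function names are made up for illustration, and the 0.1 second lifetime and 100-thread ceiling are just the rule-of-thumb figures quoted in this reply; real lifetimes depend heavily on disk access.

```python
def average_threads_in_flight(requests_per_sec, thread_lifetime_sec):
    """Little's law: average concurrency = arrival rate x time in system."""
    return requests_per_sec * thread_lifetime_sec


def max_request_rate(thread_limit, thread_lifetime_sec):
    """Highest steady request rate a given thread ceiling could sustain."""
    return thread_limit / thread_lifetime_sec


if __name__ == "__main__":
    for rps in (100, 200):
        threads = average_threads_in_flight(rps, 0.1)
        print(rps, "rps ->", threads, "threads in flight on average")
    print("ceiling with 100 threads:", max_request_rate(100, 0.1), "rps")
```

These are averages under a steady arrival rate. Bursts will briefly need more threads, which is one reason to stay well below the theoretical ceiling.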

>> Anyway, my server application will be responsible for data logging from no more than 500 sites, but will be updating like at intervals of 5-10 secs per site.

That's very, very different to 100 humans accessing the site for a limited number of hours per day, and with a _much much_ slower access rate.

500 sites every 5 seconds is 100 requests per second - and that is fairly significant traffic. Especially if you plan to do any disk access - and especially if you plan to do any disk writes for each of those requests. Also 100 x 2 KB = 200 KB per second of data, and while that's not enormous you might want to make sure your connection can cope with it. Assuming they log for around 12 hours in the day that's about 8.6 gigs of traffic per day as well. So factor the bandwidth for that into your equation as well. If it's 24 hours a day then it's about 17 gigs a day, or about 500 gigs a month. If one quarter of those are inserts then your drive needs will grow by say 125 gigs a month, or about 1.5 terabytes per year.
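The same arithmetic as a small Python sketch, so the inputs can be changed and the numbers re-run. The constant and function names are illustrative only, the payload is taken as a round 2000 bytes, and a month is assumed to be 30 days.

```python
# Back-of-envelope traffic estimate using the figures quoted in the thread.
SITES = 500
INTERVAL_SEC = 5          # worst case of the 5-10 second range
PAYLOAD_BYTES = 2000      # "no more than 2k bytes", rounded for simplicity
INSERT_FRACTION = 0.25    # share of requests that add new rows
DAYS_PER_MONTH = 30       # assumption


def estimate(hours_per_day):
    """Return request rate, bandwidth and storage growth for a logging day."""
    requests_per_sec = SITES / INTERVAL_SEC
    bytes_per_sec = requests_per_sec * PAYLOAD_BYTES
    gb_per_day = bytes_per_sec * hours_per_day * 3600 / 1e9
    gb_per_month = gb_per_day * DAYS_PER_MONTH
    growth_gb_per_month = gb_per_month * INSERT_FRACTION
    return {
        "requests_per_sec": requests_per_sec,
        "kb_per_sec": bytes_per_sec / 1000,
        "gb_per_day": gb_per_day,
        "gb_per_month": gb_per_month,
        "growth_gb_per_month": growth_gb_per_month,
        "growth_tb_per_year": growth_gb_per_month * 12 / 1000,
    }


if __name__ == "__main__":
    for hours in (12, 24):
        print(hours, "hours/day:", estimate(hours))
```

With exact arithmetic the 24-hour case comes out slightly above the rounded figures in the post (roughly 518 gigs a month rather than 500), which makes no practical difference to the sizing.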

I'm not sure what backend you're using, but hopefully it's SQL - certainly TPS won't cope with these numbers at all. You should then run your SQL database on a separate machine, and it will need a bunch of grunt as well.

So in short, for just the web server, you probably want to spec a fairly high-end machine, and especially a very fast disk subsystem to cope with the amount of data which is flowing in and out. Or possibly, as you suggested, split the logging across multiple computers (that won't change the bandwidth requirements though). You're looking at Windows Server 2012 R2 (which has good disk-aggregation abilities).

And I guess, if you need this to have "mostly all the time" uptime, then you're going to need backup systems for when a hard drive fails and so on. RAID for sure, but even RAID mostly just protects the data; it will take time to replace and rebuild broken drives - especially drives of that sort of size.

>> About your suggestion of using SSD, won't it be necessary to eventually archive or mirror to HDD?

SSD is a drive just like any drive, but faster and (usually) smaller. But SSD won't be useful anyway for the data sizes you are talking about. And I hope you've got lots of backup planned, and ideally mirrors and so on. Thrashing the disks this hard will inevitably result in regular failures.

As you can see, if you change the spec, the requirements change. But if you just do the math you can figure out what you need.
In your case it sounds like a pretty powerful monster. (Or re-architect to split the load across multiple, smaller, cheaper machines.)

cheers
Bruce

Keith · « **Reply #4 on:** January 25, 2014, 03:39:32 AM »

Hi Vic

Just wanted to add another perspective to the advice Bruce has already given.

Yours is potentially a very greedy system. You need to work out the specifications and metrics for all of the components that are required to make your application work. These are (at least):

Ram required based on the number of concurrent threads
CPU based on the number of concurrent threads
Required disk IO rate
High availability requirements
Backup requirements

If the hardware that you deploy is not capable of meeting all of these requirements then your application will fail.

The first thing you need to know for sure is the actual request rate. You said that the sites would log every 5-10 secs; that's a spread of 2:1, so if you can't control the connection rate then you'll have to plan for 5 secs. Bruce has pointed the way to the arithmetic and it boils down to 100 requests per sec, and you will need to support the RAM required by those 100 requests, the CPU capacity and the disk IO rate.

Actually, I think you will need up to three webservers (load balanced in some way) and depending on HA requirements, two DB servers - here's why.

You have to support at least 100 concurrent threads. If you look at the specs of an IBM System x3550 M4 Express server you will find that it has two sockets (Intel 4C E5-2609), and each of these sockets has four cores and supports 4 threads, for a total of 8 threads. This means that only eight of your 100 concurrent threads can execute at the same time, which means that to do all of the processing for the 100 requests you have to get through each set of 8 in 0.08 sec or 80 milliseconds. The actual time is shorter than that because even though each core can support two threads, both cannot operate at the same time all of the time because the threads share the core's memory, so it's not as good as having eight single-threaded cores.

But let's go with 80 msecs because this is just an example. For each thread you have 10 msecs or 1/100th sec to get the processing done. Bruce often quotes 1/10th sec per request or thread, which is 10x more than the time you've got. And that's if you are only doing CPU work - if you add IO then you've time for about 8 IOs for all of the 8 threads, which is just 1 IO per thread.

This server could not support your load from a CPU perspective (we'll get to the IO later).
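A quick way to re-run the batch arithmetic for a different machine is sketched here in Python. It only reproduces the "sets of 8" calculation (how many sets per second, and the time window each set gets) plus the number of such servers needed to run every request on its own hardware thread. The function name is hypothetical, and the model ignores IO waits and scheduling overhead entirely.

```python
import math


def batch_window(requests_per_sec, hardware_threads):
    """Split one second of requests into sets that fit the hardware threads.

    Returns (sets per second, milliseconds available per set,
    servers needed for one hardware thread per request).
    """
    sets_per_sec = requests_per_sec / hardware_threads
    window_ms = 1000 / sets_per_sec
    servers_needed = math.ceil(requests_per_sec / hardware_threads)
    return sets_per_sec, window_ms, servers_needed


if __name__ == "__main__":
    # 100 requests per second on the 8-thread example server.
    print(batch_window(100, 8))
```

For the example server this gives 12.5 sets per second, an 80 ms window per set, and 13 servers if every one of the 100 requests had to have its own hardware thread.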

So the point is that this server does not support enough threads. Ideally you need to be able to support 100 concurrent threads and based on this server you would need 13 of them! But you do need to think about availability and if you can't have the web server dying (which means logging is lost) then you'll need at least two and if you can't have any data lost then you'll need enough servers to be able to do all of the work even though one server fails. There are a number of things to consider but this is about service, up time and availability.

Here are the considerations: if you deploy 1 web server that can handle the peak load and it dies then no work gets done. If you deploy two servers that together can handle the peak load and one dies then half of the work does not get done. If you can't afford those scenarios then you will need at least 2 servers configured so that 1 server can handle all of the load, or 3 servers where 2 of them can handle the peak load (and trust that you won't have two server outages that overlap).

As Bruce has already said you will not be able to support the DB sizes using TopSpeed files (assuming the estimated data growth) so you will need MySQL or equivalent. Another reason for having a SQL DB is that you will need to do backup and the consideration here is whether the application and DB have to be available 24x7. If they do then you will need to be able to place the DB in 'hot backup' mode (an Oracle term I think) so that you can backup at a point in time while still processing. If you have the luxury of a period during the 24 hours when there is no work to do then you can take the application down and do the backup in peace. But remember that backup software consumes lots of CPU and disk IO resources which is another reason why you aren't going to get away with a single web server that does everything. And sometimes the backups would fail and have to be done during peak processing times!

So you'll need a separate DB server. Then the same availability questions as for the web server apply here. If you lose a single DB server then no work gets done no matter how many web servers you've got. So you'll need 2 if you can't afford to lose data. The DB layer now starts to get quite tricky and expensive and I won't go into it more other than to say that I would choose a Unix-based DB server which will be more stable than a Windows box and depending on your requirements you may be able to get away with 1 DB server. If you can't then clustering of the DB backend has to be considered and deeper pockets.

The IO rate has to be supported. You need to work out how many IOs per second are required; then a suitable disk array can be configured. If your environment is write-intensive then you will need, say, a 6-disk system configured as RAID 0+1. Here, if each disk is 2TB you will have about 6TB of usable space and each disk may be able to do 100 IOPS. Whether this is adequate or overkill depends on what you need. But what is not optional is some RAID arrangement - you have to be able to keep processing after a disk failure. RAID 5 is the cheapest but there is a heavy write penalty and more risk than a RAID 1 (or 0+1) system.
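As a rough illustration of the capacity and write-penalty trade-off, here is a Python sketch comparing the 6-disk example as a mirrored stripe and as RAID 5. The write penalties used (2 for mirrored layouts, 4 for RAID 5) are the commonly quoted rule-of-thumb values and are an assumption, not figures from this thread. Real arrays with controller caches will behave differently, and the function name is made up for the example.

```python
# Commonly quoted back-end IOs per front-end write (assumed, not measured).
WRITE_PENALTY = {"mirrored_stripe": 2, "raid5": 4}


def raid_summary(level, disks, disk_tb, disk_iops):
    """Estimate usable capacity (TB) and sustainable write IOPS."""
    if level == "mirrored_stripe":      # RAID 0+1 / RAID 10: half the disks hold copies
        usable_tb = disks * disk_tb / 2
    elif level == "raid5":              # one disk's worth of space goes to parity
        usable_tb = (disks - 1) * disk_tb
    else:
        raise ValueError("unsupported level: " + level)
    raw_iops = disks * disk_iops
    write_iops = raw_iops / WRITE_PENALTY[level]
    return usable_tb, write_iops


if __name__ == "__main__":
    for level in ("mirrored_stripe", "raid5"):
        print(level, raid_summary(level, disks=6, disk_tb=2, disk_iops=100))
```

Under these assumptions the mirrored layout gives 6TB usable and about 300 write IOPS, while RAID 5 gives 10TB usable but only about 150 write IOPS, which is the "cheapest but heavy write penalty" trade-off in numbers.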

Finally, a server. The example server above was way too small. An IBM System x3690 X5 has two sockets, 20 cores and 40 threads, and four of them would give you 160 threads, so you could lose one server and still process 120 threads. An IBM System x3850 X5 has 4 sockets, 40 cores and 80 threads, and three would allow you to lose a server and still handle the peak load. These are seriously large and expensive servers but then you seem to have a seriously large load.
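The "lose one server and still cope" check can be written down as a small Python sketch. The function names are hypothetical, and the 100-thread peak is the working figure used earlier in this reply; the model simply counts hardware threads and says nothing about load balancing or failover time.

```python
def surviving_threads(servers, threads_per_server, failed=1):
    """Hardware threads left after a number of servers have failed."""
    return max(servers - failed, 0) * threads_per_server


def survives_peak(servers, threads_per_server, peak_threads, failed=1):
    """True if the remaining servers still cover the peak thread count."""
    return surviving_threads(servers, threads_per_server, failed) >= peak_threads


if __name__ == "__main__":
    PEAK = 100
    print("4 x 40 threads:", surviving_threads(4, 40), survives_peak(4, 40, PEAK))
    print("3 x 80 threads:", surviving_threads(3, 80), survives_peak(3, 80, PEAK))
    # Two servers that only cover the peak together do not survive a failure.
    print("2 x 50 threads:", surviving_threads(2, 50), survives_peak(2, 50, PEAK))
```

The last case is the scenario described earlier where half of the work is lost when one of two servers dies.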

Cheers

Keith

Bruce · « **Reply #5 on:** January 25, 2014, 09:00:23 PM »

I agree with Keith that it's all about doing some tests, and then doing the math. However I'd like to clear up a couple of finer details, hopefully without being too picky.

Firstly the "tenth of a second" statement should not be considered as a CPU requirement. Each case is different and you should test your own situation. In some of my tests a 4 core machine can handle over 200 requests per second - in others it's less. Mostly it depends on the amount of disk access required. So when I say a thread lasts "a tenth of a second", I say that to give people an understanding of multi-thread development. It doesn't necessarily mean that a thread consumes 100ms of processing power.

Regarding ram - ram for the web server is trivial to calculate because of the 32 bit limit. If you allow 3 gigs for the web server and 1 gig for windows you're basically done. Ram is cheap so for a physical machine just running a web server 4 gigs is a no brainer. The ram for a database server is a different question.

Cheers
Bruce

de la Rosa · « **Reply #6 on:** January 26, 2014, 12:09:35 AM »

Hi Bruce/Keith,

Thank you both so much for the clarifications. I'm just glad that I can do it with Nettalk and don't have to deal with IIS or Apache. It just makes supporting it so much simpler and all within our control. Luckily the sites are also Nettalk based so we have control there too. Yes, the backend is SQL although eventually the data will end up on BIG DATA managed by Optimus. Hopefully, we can just ride on their DB infra, but otherwise your suggestion on using a Unix-based DB is invaluable, especially when reports come into play, which means another set of hardware. Btw, at this request rate, do you think a persistent connection will be more efficient than an open/close one?

Thanks,
Vic

de la Rosa · « **Reply #7 on:** August 20, 2014, 03:57:11 PM »

Hi Bruce/Keith,

I implemented a 10-site system as described in the earlier posts above, but with a faster sending rate of 0.5 secs. It worked fine for months until I built a WebService that sends out the records of a file in one go (approx. 150 records and 60k bytes for the entire file). Now it occasionally crashes. One telling sign is that sometimes reading the records from the backend does not complete, and no error is posted, meaning the routine:

Next(row)
If Not Errorcode()
  xml.save(row)
else
  stop(error())
end

will not go to stop(error()), but the SOAP call will not complete either. So I am wondering if this is a case of the backend choking, and how to find out what is causing the occasional crashes.

Thanks for any comment,
Vic

kevin plummer · « **Reply #8 on:** August 20, 2014, 05:46:27 PM »

When this happens I would look for any locked processes on your DB server. There are some things you can do in NT to avoid the threads being locked. That said, and based on what you are processing, I'm surprised this problem has not come up before.

de la Rosa · « **Reply #9 on:** August 20, 2014, 10:44:23 PM »

Hi Kevin,

I am using MSSQL 2012 and Windows 7. The DB server is working normally; it's my WebServer application that stops running or is unable to read through the returned rows completely.

Thanks,
Vic

kevin plummer · « **Reply #10 on:** August 21, 2014, 12:01:36 AM »

Hi Vic,

NT does this because it is waiting indefinitely for a lock on an SQL table to be lifted.

Next time it happens try running this query or just google "how to list locked tables in MSSQL". If the lock originated in NT, killing the app will free the lock.

select
    object_name(p.object_id) as TableName,
    resource_type, resource_description
from
    sys.dm_tran_locks l
    join sys.partitions p on l.resource_associated_entity_id = p.hobt_id

de la Rosa · « **Reply #11 on:** August 21, 2014, 10:16:29 PM »

Hi Kevin,

It looks like that's what was happening. What I did was to temporarily store the data payload in a memory table and then write to the DB at a controlled pace, basically de-randomizing the write requests. No crashes so far.
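For readers who want to see the general shape of this buffering approach, here is a minimal Python sketch. It is not the NetTalk/Clarion implementation described in the post: `WriteBuffer`, `add`, `flush_once` and `write_batch` are hypothetical names, and a plain in-memory queue stands in for the memory table. Request threads only append to the queue, and a single writer drains it in bounded batches on its own schedule.

```python
import threading
from collections import deque


class WriteBuffer:
    """Hold incoming payloads in memory so one writer can flush them steadily."""

    def __init__(self, write_batch, max_batch=50):
        self._pending = deque()
        self._lock = threading.Lock()
        self._write_batch = write_batch   # callable that writes a list of payloads
        self._max_batch = max_batch       # upper bound on rows written per flush

    def add(self, payload):
        """Called by request threads: queue the payload and return immediately."""
        with self._lock:
            self._pending.append(payload)

    def flush_once(self):
        """Called by the single writer: write at most max_batch queued payloads."""
        with self._lock:
            count = min(self._max_batch, len(self._pending))
            batch = [self._pending.popleft() for _ in range(count)]
        if batch:
            # The database write happens outside the lock so add() never waits on it.
            self._write_batch(batch)
        return len(batch)


if __name__ == "__main__":
    written = []
    buffer = WriteBuffer(written.extend, max_batch=2)
    for reading in range(5):
        buffer.add(reading)
    while buffer.flush_once():
        pass
    print(written)
```

Calling `flush_once` from a timer gives the controlled pace: the database sees one writer with small, regular batches instead of many request threads competing for the same table. Payloads still in the queue are lost if the process dies, so this trades some durability for stability.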

Thanks,
Vic
