Author Topic: Architectures for Clarion/Nettalk applications (Read 6758 times)

Keith · « **on:** July 10, 2013, 03:57:09 PM »

I just wanted to understand if there were any limitations on deployment and the number of users that could be supported and the transaction volume for our beautiful NetTalk applications.

I have attached a document because its content didn't format well in this box.

I would appreciate comments. I sort of think that everything that I've said is possible but it would be nice to be sure.

Cheers

Keith

[attachment deleted by admin]

kevin plummer · « **Reply #1 on:** July 10, 2013, 06:16:32 PM »

Finally, there is also the question of how to deploy multi-tenanted applications in a SQL (or even TopSpeed) database environment - whether to hold different client’s data in the same or separate tables and how this would translate into deployment in the database tier.

> Depending on how complex your application is and how you want it to scale, setting up separate SQL DBs seems to be the preferred method. I would also avoid hosting MSSQL and your web app on the same server, as SQL loves RAM and will eat it all up on a server. Also, the faster your web server (CPU, RAM), the faster it can process a request (thread). I think there is a max of 100 concurrent threads before NTW passes the "server too busy" page back to the user.

Bruce · « **Reply #2 on:** July 10, 2013, 09:14:19 PM »

Hi Keith,

I think you've pretty much covered all the bases, and the doc seems correct as far as I can tell - no major points jump out as being "wrong" anyway <g>.

Probably the single biggest "limit" right now is Ram. Not Ram of the machine but Ram for the process. Assuming it's running on a 64 bit OS the server can use a max of 4 gigs of Ram, although it's probable that Windows will spit you off before then.
Ram is consumed by the server in a few ways:

Each incoming request is received into Ram, and then passed to a new thread for processing. Typically requests are small, very small, but it is possible to upload a large file, and this is part of the request. For example if someone posted a 3 gig file up, then this would cause the server to consume 3 gigs of Ram. (1) Clearly a couple of simultaneous posts like this and the server crashes. (2)

As mentioned each request is handled by a new thread. Starting this thread consumes some ram (which is released when the thread ends - typically less than a second later.) The amount of ram consumed depends entirely on the number of global threaded variables, structures and objects that you have. (3), (4)

Sessions also consume some Ram, but typically this is not enough to make a difference to the overall usage in a material way. You can reduce the number of sessions by reducing the session-timeout time on the Advanced tab in the Web server. But this is likely to be the least useful optimization, unless you are really storing a lot of data per user in the Session queue.

Monitoring the stats on the Performance tab of the web server gives you insight into all of these parameters - tracking the max value for sessions and threads, and also noting the number of times the server returned a "Too Busy - Try again Later" error to a request.

Notes:
(1) There is a property of the Server (not Handler) class called MaxPostSize which lets you explicitly limit the size of incoming posts. This can be set on the Security tab of the Web Server procedure.

(2) A 64 bit version of Clarion would remove the Ram limit completely (for all practical purposes) and is one of the reasons you should email SV encouraging them on their 64 bit project. Apparently the ability to make 64 bit programs is a work-in-progress there, but it's useful for them to know that this feature is desirable.

(3) FileManager objects, and the memory for each File Record are the most obvious consumers here, so ram usage is typically in proportion to the size of your dictionary.

(4) The number of threads running at any one time is limited by the server property MaxThreads. This defaults to 100 but you can change it to better suit your app if you like.
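To make notes (1) and (4) concrete, here is a rough sketch in Python (not Clarion, and not the NetTalk API) of how a post-size cap and a thread cap together bound worst-case Ram. The function names, the 10 MB cap and the 2 MB per-thread figure are all assumptions for illustration; measure your own app for real numbers.

```python
MB = 1024 * 1024

def accept_post(content_length, max_post_size):
    """Reject a post up front if its declared size exceeds the cap (hypothetical check)."""
    return 0 <= content_length <= max_post_size

def worst_case_ram(max_threads, ram_per_thread, max_post_size):
    """Upper bound: every allowed thread is busy and each one holds a maximum-size post."""
    return max_threads * (ram_per_thread + max_post_size)

limit = 10 * MB                                   # assumed cap, not a NetTalk default
print(accept_post(3 * 1024 * MB, limit))          # a 3 gig upload is refused
print(worst_case_ram(100, 2 * MB, limit) // MB)   # bound in MB for 100 threads
```

With these assumed figures the bound stays well under the process limit, which is the point of setting both caps rather than leaving post size unlimited.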

cheers
Bruce

Bruce · « **Reply #3 on:** July 10, 2013, 09:30:04 PM »

>> Finally, there is also the question of how to deploy multi-tenanted applications in a SQL (or even TopSpeed) database environment - whether to hold different client’s data in the same or separate tables and how this would translate into deployment in the database tier.

While there is no "perfect answer" here, I think the ideal is "both". So plan the tables, and browses and so on as if the data is all mixed together, but in practice set it up so that company data is separate.

a) Separate gives you a lot more control over each company's needs. In other words companies that use the system a little can use a shared db server, whereas companies that use it a lot can ultimately be easily moved onto their own server. Special customers could even be given their own dns name (eg macies.whatever.com and bloomingdales.whatever.com and so on) and routed to separate web servers, and separate db servers.

b) From a security point of view, if the databases are distinct from each other it's harder to make a "sql mistake" that affects all customers. And it's not possible for one customer to "accidentally" see another company's data. Backups and Restores (especially restores) become a LOT easier because they can be done on a company-by-company basis rather than for the system as a whole.

c) "Moving" data (for example allowing a special customer to run their own copy of the server and database in-house) becomes easy to do if their data is already separate. Giving them just their data for say importing into their own db for reporting or whatever is also a lot easier because you can safely give them the "whole database" and not have to run a process to "extract just their data".

d) If you wanted to give a company access to _your_ database, but only for their data, then that's also do-able if the databases are separate.

e) "Shared" databases make sense when you have trial periods, short-term customers and so on. So your "long tail" of users that signed up but never used the system and so on can all share the same db. As soon as a company is using it "for real" you can have a process that moves their data into their own db.

f) To make your life a _lot_ easier down the road - I strongly recommend using a 16 char string as the row id field, not an auto-number field. This allows data to be merged together for reporting purposes later on, or if 2 companies want to merge their data (eg if one buys another) and so on. As you grow merging and splitting data become more common requests, and they're trivial to do if the db allowed for it up front, and really hard to do otherwise.

g) add a "company" field to every table, and filter every browse on that (set it as a session value when the user logs in.) In most db's this field will have all records with the same value, but it gives you flexibility if you do the "trial database" thing, or if you want to ultimately merge data later on (and be able to tell which data is which.)
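Points f) and g) can be sketched roughly as follows. This is Python rather than Clarion, purely to show the idea: the helper names (`new_row_id`, `new_row`, `merge`, `browse`) and the choice of an uppercase-plus-digits alphabet are assumptions, not anything NetTalk provides.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase + string.digits  # assumed character set

def new_row_id(length=16):
    """Random 16 char string used as the row id instead of an auto-number."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

def new_row(company, **fields):
    """Every row carries its own id and the owning company."""
    return {"id": new_row_id(), "company": company, **fields}

def merge(*tables):
    """Combine tables from separate databases; ids stay as they are."""
    merged = {}
    for table in tables:
        for row in table:
            if row["id"] in merged:
                raise ValueError("duplicate row id: " + row["id"])
            merged[row["id"]] = row
    return list(merged.values())

def browse(rows, session_company):
    """Filter a browse on the company held in the user's session."""
    return [r for r in rows if r["company"] == session_company]

acme = [new_row("acme", name="Alice"), new_row("acme", name="Bob")]
globex = [new_row("globex", name="Carol")]
combined = merge(acme, globex)
print(len(combined), [r["name"] for r in browse(combined, "globex")])
```

With auto-number ids both tables would start at 1 and the merge would need renumbering (and every foreign key fixed up); with random string ids the rows simply coexist.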

So I suppose I'm answering your question like this - plan your dictionary right, and you can defer the answer to this question to some later time when you may want to strongly go down one or the other path. Since you don't really _have_ to make the choice now, defer it till later.

cheers
Bruce

peterH · « **Reply #4 on:** July 11, 2013, 01:35:11 AM »

This is an interesting discussion! I like Bruce's idea of "doing both" when it comes to multi-tenanted apps. Let all the trial customers and maybe also the small ones share one db and move the bigger ones to their own. Nice strategy.

There is however one very important aspect missing in the discussion: how to handle db maintenance (in the broadest terms).

If you end up with several hundred or maybe even several thousand databases you must have a plan in place for things like doing database changes and even regular maintenance and backups. Remember, there's only one program shared by all and if a new version requires changes to existing databases you must make the changes "all at once" if possible. Otherwise you'll have to work out a plan for doing it in a way that doesn't disturb the customers (too much). I'm still considering how to do this in the most efficient manner so thoughts are very welcome.

Peter

Keith · « **Reply #5 on:** July 11, 2013, 01:58:09 AM »

Thanks for all of the comments. If I understand correctly the main limit then is that the exe itself can only use 4GB of Ram because the Clarion compiler is 32 bit (2**32 = 4GB). So the application can accommodate requests for more Ram because of new threads but only up to that limit.

If this is correct and we were contemplating deploying with say 2000 concurrent users then that would seem to imply that we would need lots of smaller servers each with say 6GB of Ram so that there was some left over for the OS etc AND no MSSQL DB on the server (Kevin).

This would be a bit of a pain with physical servers because their capacity would be wasted but would be sort of ok in say a VMware environment where you could define a number of 2 core virtual servers each with 6GB of Ram (inside much larger physical servers). But I agree, we need a 64 bit application and not having one is a negative. I will write to SV.

Bruce's DB design guidelines look good too and as he notes would be great for merging companies - think of the money that is spent now doing that and it mainly relates to fixing up the databases.

I will need to think about PeterH's question about DB maintenance - I don't like the sound of "Remember, there's only one program shared by all".

Keith

Bruce · « **Reply #6 on:** July 11, 2013, 02:33:07 AM »

>> If this is correct and we were contemplating deploying with say 2000 concurrent users then that would seem to imply that we would need lots of smaller servers

No, this is an incorrect conclusion. Remember we're not talking about Ram per _user_ here, but Ram per thread. A thread lasts around one tenth of a second, so if you had support for say 100 threads that's roughly 1000 requests per second. Your 2000 users would need to be generating a request every 2 seconds to get near to that number.

If you planned to have 2000 users "using" the system (at the "same time") then let's assume they do something every 10 seconds. In truth that's a LOT faster than they're likely to be using it. So that's about 200 per second.

But if the 2000 users are really spread out over the day, each using it for say 2 hours in the day, then you can divide that by 5 again (assuming a 10 hour day), which leaves about 40 per second.

In other words what you're really planning for is "number of requests per second", and that has to do with usage patterns, not the absolute number of users.
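The arithmetic above can be written out as a small Python sketch. The function names are made up for illustration, and the inputs are just the figures from this post (100 threads, a tenth of a second per thread, 2000 users, a request every 10 seconds, 2 hours of use in a 10 hour day).

```python
def server_capacity(max_threads, thread_seconds):
    """Requests per second the server can absorb if every thread slot is in use."""
    return max_threads / thread_seconds

def request_rate(users, seconds_between_requests, active_hours=1, day_hours=1):
    """Average requests per second generated by the users."""
    return users / seconds_between_requests * active_hours / day_hours

capacity = server_capacity(100, 0.1)        # roughly 1000 per second
all_at_once = request_rate(2000, 10)        # everyone active at the same time
spread_out = request_rate(2000, 10, 2, 10)  # each user active 2 hours in 10

print(round(capacity), round(all_at_once), round(spread_out))
```

Swap in your own thread time and usage pattern; the user count on its own tells you very little.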

But wait, there's more.

Each _process_ on the computer can consume up to 2 gigs of Ram, but you can run multiple processes on the same box. For example, if your box had multiple IP addresses you could use www.whatever.com and www2.whatever.com and then run a copy of the exe for each, each bound to its own IP address. So you don't necessarily need to have multiple servers to support multiple instances of the EXE.

If you can't get multiple IP addresses, then multiple ports can be used (although that's not as "clean").
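As a generic illustration (plain Python sockets, not NetTalk), two listeners can coexist on one box as long as they differ in IP address or port. In real use each listener would live in its own copy of the exe; here both are opened in one script, on the loopback address with OS-assigned ports, only to keep the sketch self-contained. `bind_listener` is a hypothetical helper name.

```python
import socket

def bind_listener(ip, port):
    """Open a listening TCP socket on one specific address and port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((ip, port))   # port 0 asks the OS for any free port
    sock.listen()
    return sock

first = bind_listener("127.0.0.1", 0)
second = bind_listener("127.0.0.1", 0)

first_port = first.getsockname()[1]
second_port = second.getsockname()[1]
print(first_port != second_port)   # same IP, so the ports must differ

first.close()
second.close()
```

With multiple IP addresses both instances could keep the standard port, which is why that option is "cleaner" for users.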

>> But I agree, we need a 64 bit application and not having one is a negative. I will write to SV.

Be careful of over worrying about this limitation at this point in time. I've mentioned it because it's the only "absolute" limit that I'm aware of (everything else depends ultimately on the power of the box, and the size of your bandwidth.)
Our server is currently handling around 20000 hits per day, and it barely gets above idle. (We have a large number of devices that "phone home" with diagnostic and backup information every day - or sometimes multiple times in a day).

>> Remember, there's only one program shared by all and if a new version requires changes to existing data bases you must make the changes "all at once" if possible. Otherwise you'll have to work out a plan for doing it in a way that doesn't disturb the customers (too much).

In practical terms timing is everything. And some of the other effects of scale ameliorate this to some extent. For example:

client A now has (effectively) their own DB and Server instances. So they can be updated independent of the others, at a time most suitable for them.

Bear in mind that Clarion allows for the program to use a "subset" of the fields on the server (in SQL). So if you need a new field, BX, then that can be added to the databases (using a simple SQL script) well before you actually deploy the new web server exe.
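Here is a rough sketch of that ordering, using Python's built-in sqlite3 as a stand-in for the real database server. The table, its columns and the queries are invented for illustration; only the field name BX comes from the text.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customer (id TEXT PRIMARY KEY, company TEXT, name TEXT)")
db.execute("INSERT INTO customer VALUES ('A1', 'acme', 'Alice')")

# The currently deployed exe only knows about these fields.
OLD_EXE_QUERY = "SELECT id, company, name FROM customer"
before = db.execute(OLD_EXE_QUERY).fetchall()

# Step 1: the "simple SQL script", run ahead of the new exe.
db.execute("ALTER TABLE customer ADD COLUMN bx TEXT")

# The old exe keeps working because it reads a subset of the fields.
after = db.execute(OLD_EXE_QUERY).fetchall()
print(before == after)

# Step 2 (later): the new exe starts using the extra field.
print(db.execute("SELECT id, bx FROM customer").fetchall())
```

Because the database change and the exe change are decoupled, the databases can be altered one by one at any convenient time.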

If you have a load-balancer then you can
a) update the backends so they are ready
b) take one server offline, update the exe, restart the exe
c) rinse and repeat with other servers.
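The load-balancer steps can be simulated in a few lines of Python. This is only a model of the procedure (the server records and `rolling_update` are hypothetical); it does not talk to a real load balancer.

```python
def rolling_update(servers, new_version):
    """Update servers one at a time; return the fewest servers online at any point."""
    lowest_online = len(servers)
    for server in servers:
        server["online"] = False                       # take one server offline
        online = sum(1 for s in servers if s["online"])
        if online == 0:
            server["online"] = True                    # put it back, nothing else can serve
            raise RuntimeError("no other server is online to take traffic")
        lowest_online = min(lowest_online, online)
        server["version"] = new_version                # update the exe
        server["online"] = True                        # restart it
    return lowest_online

pool = [{"name": n, "version": "1.0", "online": True} for n in ("web1", "web2", "web3")]
lowest = rolling_update(pool, "1.1")
print(lowest, [s["version"] for s in pool])
```

With three servers there are never fewer than two taking traffic, so users see no interruption.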

If you have a single web server then yes, at some point you need to
a) stop it
b) copy on new exe
c) restart it.

This can happen quite quickly (within a few seconds) especially if the data is "pre-prepared". And it's possible to "dump" the sessionqueue to say an XML file when closing the server, and re-load it on open, so users don't actually notice the changeover (even if they are literally doing something at the time.)
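The dump-and-reload idea looks roughly like this in Python. The XML layout and the function names are assumptions made for the sketch; they are not the format NetTalk uses, and the session store here is just a dictionary of string values.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

def dump_sessions(sessions, path):
    """Write every session and its values to an XML file as the server closes."""
    root = ET.Element("sessions")
    for session_id, values in sessions.items():
        node = ET.SubElement(root, "session", id=session_id)
        for name, value in values.items():
            ET.SubElement(node, "value", name=name).text = value
    ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)

def load_sessions(path):
    """Rebuild the session store from the XML file when the server opens."""
    sessions = {}
    for node in ET.parse(path).getroot().findall("session"):
        sessions[node.get("id")] = {
            v.get("name"): v.text or "" for v in node.findall("value")
        }
    return sessions

live = {"S1": {"company": "acme", "user": "alice"}, "S2": {"company": "globex"}}
path = os.path.join(tempfile.mkdtemp(), "sessions.xml")
dump_sessions(live, path)          # old exe shutting down
restored = load_sessions(path)     # new exe starting up
print(restored == live)
```

Users keep the same session id in their cookie, so as long as the restored store holds the same values they carry on where they left off.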

Of course getting this fancy pre-supposes that you don't have "quiet times", perhaps weekends or at night, when actually pretty much no-one uses the server anyway.

In short - it's not that hard to handle this case, and if you are scaling up that large there are pretty straight-forward things you can do to make it relatively comfortable.

cheers
Bruce

peterH · « **Reply #7 on:** July 11, 2013, 04:54:00 AM »

Hi Bruce,

Thanks for your insight & ideas, most appreciated. I particularly like the dump-the-session-queue trick although I might never really need to use it.

Peter

NetTalk Central
