Improving performance (apache, sendmail, samba, kernel, i/o)


Keywords: tune, speed, optimization, samba, apache, sendmail, kernel, io
Subj : Improving performance (apache, sendmail, samba, kernel, i/o)
-------------------------------------------------------------------------------
Apache:
--------
If you are experiencing long delays at "Waiting for reply..." with Apache servers, you may try the following hack I use frequently to speed up your system's performance.

Apache comes with a limit of 256 child processes defined in its hard server limit (in the source code). This can kill Apache when it is used for a very busy site. Normally in the httpd.conf file you can only set a maximum of 256 MaxClients, which is not enough for a busy site. To change this limit (which the makers of Apache suggest as well), edit src/include/httpd.h and change the value of HARD_SERVER_LIMIT. Bump it up to 1000, then set MaxClients in your httpd.conf to something like 512. That should be ample. Recompile Apache and you are done.

Note: Linux itself has a per-user "max processes" limit. Add this to root's .bashrc file (or whatever startup script your particular shell uses):

ulimit -u unlimited

You must log out and log back in before starting your new Apache, otherwise you will run into problems. To verify that you are ready to go, make sure that when you type "ulimit -a" as root, it shows "unlimited" next to "max user processes". (Note: you may also run "ulimit -u unlimited" at the command prompt before starting httpd instead of adding it to .bashrc, but I always forgot, so I just added it to .bashrc as a safety net. Another good place to put "ulimit -u unlimited" is the httpd startup script in /etc/rc.d/init.d.)
------------
Compiling programs:
--------
To squeeze the most performance out of your x86 programs, you can use full optimization by compiling with the -O6 flag. Many programs have -O2 in their Makefile; -O6 is the highest level of optimization. It increases the size of the resulting binary, but the binary runs faster. You can also use pgcc to compile with extra optimization for 586 or later CPUs. More information can be found at http://www.goof.com/pcg/
------------
When compiling, use -fomit-frame-pointer. This uses the stack for accessing variables. Unfortunately, debugging is almost impossible with this option.

Do not use shared libraries: shared libraries are roughly 10% slower, because they need to access variables via a base register, and there is some overhead for relocating jumps into a shared library. But shared libraries can improve performance when the system is short of memory, because shared executables are generally much smaller.

For numerical applications: avoid numeric exceptions.
--------------
Sendmail:
--------
Here's the key to configuring sendmail for optimal speed and security: don't use the m4 configuration system. It's overly complex and allows for too much cruft (like the old % hack and UUCP). I took the time to write a custom sendmail.cf, and have found it much easier to put in the features I want while keeping a secure shop.
---------------
Samba:
--------
Add these socket options to your smb.conf file:

TCP_NODELAY IPTOS_LOWDELAY SO_SNDBUF=4096 SO_RCVBUF=4096

Try timing a large (I recommend 8 MB) file copy operation to a local disk before and after the change (one way to do that is sketched below).
---------------
With a 10 Mb/s network card, copying an 8 MB file from Windows NT 4.0 on a 586 AMD 133 MHz to Red Hat Linux 2.2.10 on a PII 400 MHz using Samba 2.0.4b took 12.39 s with TCP_NODELAY on its own, but 16.02 s with IPTOS_LOWDELAY SO_SNDBUF=4096 SO_RCVBUF=4096 added.
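A minimal sketch of timing such a copy from a Linux client, assuming smbclient is available and that a share named "share" exists on a host named "server" (the host, share, and user names are placeholders; substitute your own):

dd if=/dev/zero of=/tmp/testfile bs=1k count=8192    # create an 8 MB test file
time smbclient //server/share -U user%password -c 'put /tmp/testfile testfile'

smbclient also prints its own transfer rate after the copy, which makes before/after comparisons easy.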
Here is part of my smb.conf file (just in case...):

[global]
   workgroup = WG
   server string = Samba Server
   encrypt passwords = Yes
   smb passwd file = /etc/smbpasswd
   log file = /var/log/samba/log.%m
   max log size = 50
   security = user
   socket options = TCP_NODELAY
   domain master = Yes
   local master = Yes
   preferred master = Yes
   os level = 65
   wins support = Yes
   dns proxy = No
   name resolve order = wins lmhosts bcast host
   interfaces = 192.168.1.1/24 127.0.0.0/8
   bind interfaces only = Yes
   hosts allow = 192.168.1.3 127.
   hosts deny = ALL
   debug level = 1
------------------
Ftpd:
--------
ftpd is bottlenecked by the inetd process, which requires the creation of a child process for each incoming FTP request, and the creation of further processes during each directory listing. One of the most immediate ways to improve FTP performance is to replace the standard ftpd server process with a high-performance "no-forks" variant. By pre-spawning the FTP server processes, and additionally handling (and caching) directory listings internally, a significant amount of process-creation overhead is avoided. One well-known such product is NcFTPd. While not free software, it does come with a no-cost license for personal use or .edu domains. NcFTPd and more information about it can be found at http://www.ncftpd.com/ncftpd/ Comparison metrics between NcFTPd and wu-ftpd can be found at http://www.ncftpd.com/ncftpd/perf-linux.html
------------------
Name lookups will be quicker if you install a local nameserver and forward requests to your ISP's DNS server. In /etc/resolv.conf add the line:

nameserver 127.0.0.1

And in /etc/named.conf add:

transfer-format many-answers;
forward first;
forwarders { x.x.x.x; y.y.y.y; };

where x.x.x.x and y.y.y.y are the IP addresses of your ISP's DNS server and another DNS server respectively.

There are two really simple configuration changes that can lower the CPU and memory that named requires to run (a named.conf sketch follows this list):

i. Use the listen-on directive in the options section of the named.conf file. For each interface a nameserver listens on, a pair of filehandles is opened. On a busy nameserver, saving every filehandle is a big win.

ii. Turn off unnecessary zone transfers! It's often interesting to measure the number of zone transfers a nameserver that is authoritative for a zone actually does. I've measured 65,000 in an hour, and blocking the unauthorised transfers knocked around 15% of CPU usage off the named process (down to 10% on an UltraSPARC 1). Do this either in the firewall, or use the allow-transfer directive in the options section of the named.conf file and watch the number of zone transfers drop.

These are common configuration "misses" that really do make a difference in the way the daemons run.
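A minimal sketch of how those two directives might look in the options section of named.conf (the addresses are examples only: 192.168.1.1 stands for the interface named should listen on, and 192.168.1.2 for a secondary nameserver that is still allowed to pull zones; adjust both to your network):

options {
    listen-on { 192.168.1.1; };        // open filehandles on this interface only
    allow-transfer { 192.168.1.2; };   // refuse zone transfers to everyone else
};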
--------------------
Linux kernel:
-------------
Kernel network:

There are a few settings to be tuned in /proc/sys/net/ipv4 for intranet servers (and, unfortunately, benchmarks). Linux 2.2.x supports and fully utilizes TCP options, including window scaling, TCP timestamps, and selective acknowledgements. These help to get much better general responsiveness and bandwidth utilization on typical Internet links, i.e. lossy, congested, long-delay, erratic connections. But using them on an intranet, where you can keep an eye on congestion, might not help things.

When you are sure that almost all traffic is going to be on good Ethernet or similar high-quality, high-bandwidth networks, you might want to try disabling these options one by one, or all of them (but don't forget to check whether you really get better results; usually the difference is not detectable, and in that case revert to the defaults, which are better for the network). Selective acknowledgements and window scaling are not expected to improve results on LANs, but they also don't cost much more than a few bytes per connection (usually), while timestamps might give better performance on longer transfers anywhere. Not using TCP options at all (i.e. disabling all of them) makes things only very slightly easier for the kernel.

To disable TCP timestamps:
echo 0 > /proc/sys/net/ipv4/tcp_timestamps
To disable window scaling:
echo 0 > /proc/sys/net/ipv4/tcp_window_scaling
To disable selective acknowledgements:
echo 0 > /proc/sys/net/ipv4/tcp_sack

To re-enable any of them, echo 1 into the same file. In 2.2.x kernels all of them default to 1 (enabled), which is generally best for Internet-connected hosts.

Currently Linux a la Red Hat 5.2 has a default limit of 256 IP aliases... bad news for web hosting based on IP addresses. You can change the alias maximum by modifying the following file:
/proc/sys/net/core/net_alias_max
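For example, to raise the limit, write a new value into that file at boot time (1024 here is only an illustrative number; pick whatever your hosting setup needs, and put the line into a startup script such as /etc/rc.d/rc.local so it survives a reboot):

echo 1024 > /proc/sys/net/core/net_alias_max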
--------------
I/O:
--------
I've got 2x performance increases on massive disk I/O operations (like cloning disks) by setting the IDE drivers to use DMA and 32-bit transfers. The kernel seems to use more conservative settings unless told otherwise. The commands are:

# /sbin/hdparm -c 1 /dev/hda    (or hdb, hdc, etc.)

to use 32-bit I/O over the PCI bus. (The hdparm(8) manpage says that you may need to use -c 3 for some chipsets.) Use

# /sbin/hdparm -d 1 /dev/hda    (or hdb, hdc, etc.)

to enable DMA. This may depend on support for your motherboard chipset being compiled into your kernel. You can test the results of your changes by running hdparm in performance-test mode:

# /sbin/hdparm -t /dev/hda    (or hdb, hdc, etc.)

When you've found the optimal settings, you should consider doing a

# /sbin/hdparm -k 1 /dev/hda    (or hdb, hdc, etc.)

to keep these settings across an IDE reset. I've seen the kernel reset the IDE controller occasionally, and if you don't set -k 1, the other settings will revert to their defaults and you'll lose all your performance gains. The -m option can be used to change the number of sectors transferred on each interrupt. You may get additional gains by tweaking this, but it didn't do anything for me.
--------------------
You can use the -X12 option of hdparm to set the PIO mode of the disk to PIO mode 4. This should get you a slight performance gain. Only use it if you have mode 4 disks. Also, once you have a set of hdparm options that you are happy with, don't forget to put the line in your /etc/rc.d/rc.local file so it runs every time you reboot the machine (a sketch follows at the end of this section).
--------------------
It's worth knowing that the kernel has 16-bit EIDE support turned on by default. Setting the -m option of hdparm to 16 saves you interrupts when moving 16 blocks, because 16 blocks are now moved per interrupt instead of the default 1 (if I am not mistaken). Setting -m to 16 could actually be detrimental on slower machines; -m 4 would be perfect for my old DX50, YMMV (benchmark!). This option gives me several additional MB/s on my ALi chipset machine (ASUS P5A board): I get 12 MB/s with hdparm -c1 -m16 -a128 compared to around 6 MB/s with the defaults on a 10 GB WD drive. Notice, this is WITHOUT UDMA support (I can't find ALi M1541 UDMA patches anywhere, though I've heard they exist!).
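A minimal sketch of what such an rc.local line could look like (the drive name and the particular option values are examples only; use the combination you settled on after benchmarking with hdparm -t):

/sbin/hdparm -c 1 -d 1 -m 16 -k 1 /dev/hda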
