http://clustra.norway.sun.com/twiki/bin/view/HADB/WebHome
http://clustra.norway.sun.com/twiki/bin/view/HADB/HadbSystemDocumentation

HADB-specific user doc in the Appserver 8.1 HA Admin guide, chapter 2:
  http://docs.sun.com/source/819-0216/hadbsetup.html

HADB pkgs available from: /net/paradise.sfbay/export/integrate_dock/products
  (subdirs hadb_4.5.0, hadb_4.4.2, hadb_4.4.1)

The current release of HADB: Brage, being integrated into JES 5. Appserver version: 9.0 ??
Next release: Idun (Norwegian goddess of youth) for JES 7.
For JES 6, only a minor release containing bug fixes will be made.
JES 7 build 1 beta targeted at: 1st Oct 2006.
For the Idun release, the major themes are:
  1) Multithreading of the HADB kernel
  2) Internationalization
  3) Monitoring: enables users to monitor HADB.

PRD review page:
  http://clustra.norway.sun.com/twiki/bin/view/DBTGEngineering/HadbPrdReview
Derby sample review page:
  http://clustra.norway.sun.com/twiki/bin/view/DBTGEngineering/DerbyJES6PRDDFinalDraftSep05Review

HADB install procedure:
- follow the link to the HADB checkin procedure from the home page
- source code is on clustra.norway.sun.com; CVS repository: /clustra/cvs
    % rlogin clustra.norway
    % setenv CVSROOT /clustra/cvs
    % cvs co clustra

Checking out from remote:
    % setenv CVSROOT :pserver:thava@clustra.norway.sun.com/clustra/cvs
    % cvs login
    % cvs co clustra

How to use cvs:
    % setenv CVSROOT /my/dir/cvsroot
    % cvsinit            ; creates subdir CVSROOT, etc. (equivalent of sccs create)
To put files into cvs:
    % cd /my/src
    % cvs import -m "my sample prog" project
      (this imports all files in /my/src/project to $CVSROOT)
    % cvs checkout project
      (now project is ready to be updated, by checking out *all* files in project)
Now you can edit any file:
    % chmod a+rw a.c ; vi a.c      ; make the changes
To commit a change in one file a.c:
    % cvs commit -m "Added one line change" a.c
To recheckout/reget the updated source:
    % cvs update a.c
    % cvs status a.c               ; displays status
To stamp, i.e. to create a branch or label a snapshot:
    % cvs tag my-release-1 .       ; all files in the current dir are tagged
    % cvs checkout -r my-release-1 project   ; checks out this release instead of the latest
    % cvs tag t#456789 myfile.c
---
To create a branch off of my-release-1:
    % cvs rtag -b -r my-release-1 my-patch project
    % cvs checkout -r my-patch project       ; checks out the new branch
---
To add/remove a file in a module:
    % cvs add filename; cvs remove filename
---
Compiling:
    % ./configure
    % make makefiles
    % make
    % make install
  (the page describes: make makefiles; make; make install)
  (use ./configure --help for help)
To compile on a windows machine:
    % ./configure --prefix=//yme/cluwin/$USER/clustra
- To make on a given platform:
    % setenv CLU_PLATFORM ??
- After code changes: run MATS (min acceptance test) on 4 platforms:
  solaris, x86, linux and windows.

- lecture_1.sxi: Advanced HADB Topics by Sivert Sorumgard
  - management client interface - hadbm
  - database client interfaces:
    - JDBC type 4 driver (refresh memory on the other driver types)
    - ODBC (C and C++)
    - clusql
  Table fragmentation:
  - by hashing the primary key of records
  - there is a primary replica and a hot-standby fragment replica
  - updates are sent to the primary node; the primary node registers the
    operation in a log record, which is sent to the hot-standby node. The
    operation is performed on the primary replica within the transaction
    commit.
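As a rough illustration of the fragmentation scheme above, the sketch below
hash-partitions rows over mirrored node pairs. It is only a sketch: the class,
the use of hashCode(), and the node-pair layout are illustrative assumptions,
not HADB's actual hash function or data structures.

  // Illustrative sketch only, not HADB code: hash-partition rows over
  // fragments, each stored as a primary replica and a hot-standby replica
  // on a mirrored node pair.
  public class FragmentSketch {

      // Hypothetical mirrored node pair holding one fragment.
      static final class Fragment {
          final int primaryNode;
          final int hotStandbyNode;
          Fragment(int primaryNode, int hotStandbyNode) {
              this.primaryNode = primaryNode;
              this.hotStandbyNode = hotStandbyNode;
          }
      }

      private final Fragment[] fragments;

      FragmentSketch(Fragment[] fragments) {
          this.fragments = fragments;
      }

      // Pick the fragment for a row by hashing its primary key
      // (hashCode() stands in for HADB's real hash function).
      Fragment fragmentFor(Object primaryKey) {
          int idx = Math.floorMod(primaryKey.hashCode(), fragments.length);
          return fragments[idx];
      }

      public static void main(String[] args) {
          // Two mirrored pairs, as in a 4-node database: (0,1) and (2,3).
          FragmentSketch db = new FragmentSketch(new Fragment[] {
              new Fragment(0, 1), new Fragment(2, 3)
          });
          Fragment f = db.fragmentFor("customer-42");
          // An update would go to f.primaryNode; the log record is shipped to
          // f.hotStandbyNode before the transaction commits.
          System.out.println("primary=" + f.primaryNode
                             + " hotStandby=" + f.hotStandbyNode);
      }
  }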
- A node consists of these processes:
  Node Supervisor (NSUP): clu_nsup_srv
  Children:
    - clu_sql_srv    - SQL
    - clu_trans_srv  - TRANS  - transaction
    - clu_relalg_srv - RELALG - relational algebra
    - clu_sqlshm_srv - SQLSHM - SQL shared memory
    - clu_noman_srv  - NOMAN  - node manager
    - clu_resmgr_srv - resource manager
- online upgrade, rolling restart, scaling by adding nodes/hosts.

Lecture 2: HADB Management Internals

  HADBM  ----- mgt-API ----->  HADB-MA (MBean server)

  Does hadbm use the JDBC driver at all? Yes, it does -- but only for setting
  the db system password:

    Class.forName("com.sun.hadb.jdbc.Driver");
    Connection conn = DriverManager.getConnection(jdbcUrl, "system", currPassword);
    String sql = "alter user system password '" + newPassword + "'";
    Statement stmt = conn.createStatement();
    conn.setAutoCommit(true);
    stmt.execute(sql);

  A database contains multiple nodes. A node belongs to at most 1 database;
  a node is not shared among multiple databases, even when they are all in the
  same domain.

Lecture 3: Node Supervisor - Sivert Sorumgard
  - service sets, node sets
  - history files -- example entries by the TRANS server, etc.

Lecture 4: Transaction Processing - Roy and Maitrayi
  - Atomicity - 2-phase commit protocol
  - Consistency/Isolation - locks
    * read uncommitted; read committed; repeatable read; serializable (not supported)
    * no serializability, due to lacking range locks -- explain.
  - Durability - except for double failure

  Connection management:
  - jdbc or odbc: hadbjdbc4.jar, or hadbodbc-mt.so.1* / hadbodbc-st.so.1
    (what is -mt and -st ??)
  - client-side connection pooling is supported for both odbc and jdbc

  2-phase commit in a distributed environment:
  - First phase: all operations have been done internally at the primary
    transaction controller, and ACKs have been received confirming that the
    necessary info has reached the other nodes. Internally mark commit and
    reply to the requestor as if the transaction has been committed.
  - Second phase: send the commit command to all nodes and wait for ACKs.
    Once all ACKs have been received, it is guaranteed that the other nodes
    have completed the transaction.

Lecture 5: Monitoring and Tuning - Maitreyi
  - debugging HADBM: installpath/lib/hadbmlogging.properties:
      com.sun.hadb.mgt/cli.level=FINEST; etc.
  - to trace client connections:
      hadbm set ConnectionTrace/SQLTraceMode
      hadbm set EventBufferSize
  - detailed debugging of the HADB servers is possible only with the debug
    version; the default delivered version is built without debug.

----
From the sys arch document:
- MA contains 2 categories of MBeans: (a) agent mgt MBeans and (b) DB MBeans.
  MBeans are loaded dynamically from the HADB s/w pkgs and registered in the
  MBean server??
- agents use the cluma protocol for communicating with running HADB servers
- the windows build of MA interfaces with the Windows service manager for startup
- LSARC cases: 2004/450; 2004/687; http://sac.eng/arc/LSARC/2004/
  UIRB case 2004/478 (http://uirb.east)
- MA opens 2 communication endpoints on the same port number, by default 1862:
  * one TCP port that accepts JMXMP connections from hadbm and DbState
  * one UDP port for multicast communication between agents
  (a client connection sketch follows at the end of this section)
- hadbm: CLI framework/commands

    Main, Command, Framework
    ----------------------
    Adapter (same as adminapi??)
    ----------------------
    Mgt API: MAConnection -> ManagementDomain -> Database
    ----------------------
    JMX
    ----------------------
    Management Agent: HADBManagementMBean, DatabaseMBean, etc.
    ----------------------

- Agent components:
  * defer until you do a jdb session on the MA agent.
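As referenced above, here is a minimal sketch of connecting to the MA's JMXMP
port and reading an MBean attribute. It is an assumption-laden sketch, not
hadbm's actual code path: it assumes the JMXMP connector classes
(jmxremote_optional.jar) are on the classpath, uses the default port 1862 and
the HADB:type=InterMA object name / ReleaseInfo attribute listed later in
these notes, and omits whatever authentication environment the MA actually
requires.

  import javax.management.MBeanServerConnection;
  import javax.management.ObjectName;
  import javax.management.remote.JMXConnector;
  import javax.management.remote.JMXConnectorFactory;
  import javax.management.remote.JMXServiceURL;

  public class MaProbe {
      public static void main(String[] args) throws Exception {
          String host = args.length > 0 ? args[0] : "localhost";
          // 1862 is the default MA port noted above; JMXMP is the protocol
          // accepted on the MA's TCP endpoint.
          JMXServiceURL url = new JMXServiceURL("service:jmx:jmxmp://" + host + ":1862");
          // A real MA presumably needs admin credentials in the env map;
          // passing null here is a simplification.
          JMXConnector connector = JMXConnectorFactory.connect(url, null);
          try {
              MBeanServerConnection mbsc = connector.getMBeanServerConnection();
              ObjectName interMa = new ObjectName("HADB:type=InterMA");
              Object release = mbsc.getAttribute(interMa, "ReleaseInfo");
              System.out.println("MA release: " + release);
          } finally {
              connector.close();
          }
      }
  }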
- What do NSUP and NMON do?
  * MA starts the NSUP (node supervisor) process, which in turn forks the db
    processes and monitors the service sets (the list of services available).
    MA retrieves this service set from NSUP.
  * The Node Manager (NMON) performs more complex controlled operations. MA
    asks NMON for the database state, and to perform stop and restart
    operations on the database or on single nodes.

- com.sun.hadb.dbstate.DbState is a small (one-file) API for getting the DbState.

hadb mgt agent funcspec:  http://clustra.norway/projects/odin/dbmgt/funspec/agent_odin2.sxw
hadbm funcspec:           http://clustra.norway/projects/odin/dbmgmt/funspec/hadbm_for_Odin.sxw
HADB management system arch specification: ??
JGroups: http://www.jgroups.org
hadb system arch specification:
  http://clustra.norway/intraweb/product/sysdoc/archspec/archspec.html
- JSR 47 is the logging API.

Pointers:
HADB SQL2003 spec:
  http://clustra.norway.sun.com/sql/library/standards/INCITS+ISO+IEC+9075-2-2003.pdf
Volume 1 of the DRDA spec:
  http://clustra.norway.sun.com/sql/library/standards/DRDA-V3-VOL1-Distributed_Relational_Database_Architecture.pdf

-----
Some useful links for MF:
  http://icncweb.france/idd/jmgt/jesmf/instrumTutorial/toc.html
----

How to start a rolling upgrade to a new version?

Step 1: Check that HADB is running with appserver 7:
  $ ps -aef | grep clu_
  /sun/appserver7/SUNWhadb/4.4.2-7/lib/server/clu_sql_srv ...

Step 2: Stop the existing ma:
  > /sun/appserver7/SUNWhadb/4/lib/ma-initd stop
  Management agent stopped. Bye.

Step 3: Change the configuration so the new ma starts with the old
configuration (edit the ma-initd script):
  HADB_MA_CFG=/sun/appserver7/SUNWhadb/mgt.cfg

Step 4: Start the new ma with the old configuration:
  > /var/tmp/hadb/SUNWhadb/4/lib/ma-initd start
  Management Agent version 4.4.2.20 [V4-4-2-20 2006-01-23 09:56:59 pakker@supra01] (SunOS_5.8_sparc) starting
  Logging to /sun/appserver7/SUNWhadb/ma/ma.log

Step 5: Execute registerpackage:
  > /var/tmp/hadb/SUNWhadb/4/bin/hadbm registerpackage --packagepath=/var/tmp/hadb/SUNWhadb/4.4.2-20 V4.4.2-20
  Please enter the password for the admin system user:*********
  Package V4.4.2-20 successfully registered.

Step 6: Set the new package for hadb (rolling restart).
  Note: this triggers the rolling restart on the other nodes???
  > /var/tmp/hadb/SUNWhadb/4/bin/hadbm set packagename=V4.4.2-20 hadb
  Please enter the password for the admin system user:*********
  Database attributes successfully set on database hadb.

Step 7: Remove old references.
  Note: any reconfig command would do; it just flushes the old references.
  > /var/tmp/hadb/SUNWhadb/4/bin/hadbm set connectiontrace=false hadb

Step 8: Unregister the old package:
  > /var/tmp/hadb/SUNWhadb/4/bin/hadbm listpackages
  Package    Path                              Hosts
  V4.4.2-20  /var/tmp/hadb/SUNWhadb/4.4.2-20   ssc2609,ssc2610
  V4.4.2.7   /sun/appserver7/SUNWhadb/4.4.2-7  ssc2609,ssc2610
  > /var/tmp/hadb/SUNWhadb/4/bin/hadbm unregisterpackage --hosts=ssc2609,ssc2610 V4.4.2.7
  Package V4.4.2.7 successfully unregistered.

Confirm the status:
**************************************************************************
> /var/tmp/hadb/SUNWhadb/4/bin/hadbm status
Please enter the password for the admin system user:*********
Database  Status
hadb      HAFaultTolerant

> /var/tmp/hadb/SUNWhadb/4/bin/hadbm status -n
Please enter the password for the admin system user:*********
NodeNo  HostName  Port   NodeRole  NodeState  MirrorNode
0       ssc2609   15200  active    running    1
1       ssc2610   15220  active    running    0
2       ssc2609   15240  spare     running    3
3       ssc2610   15260  spare     running    2
**************************************************************************

Useful commands:

$ ./hadbm-i browsembeanserver
Class                                  ObjectName
com.sun.hadb.mgt.agent.HADBManagement  HADB:type=HADBManagement
com.sun.hadb.mgt.agent.Readiness       HADB:type=readiness
javax.management.MBeanServerDelegate   JMImplementation:type=MBeanServerDelegate
com.sun.hadb.mgt.agent.Logging         HADB:type=Logging
com.sun.hadb.mgt.agent.InterMA         HADB:type=InterMA
com.sun.hadb.mgt.agent.Auth            HADB:type=auth

$ ./hadbm-i getmbean HADB:type=HADBManagement
Attribute              Value
Host                   HADB:type=Host,hostname=localhost,id=H_localhost
Hostname               localhost
DefaultSoftwareBundle  null

$ ./hadbm-i getmbean HADB:type=readiness
Attribute  Value
Ready      false
InDomain   false

$ ./hadbm-i getmbean HADB:type=Logging
Attribute            Value
ConsoleHandlerLevel  WARNING
FileHandlerLevel     INFO

$ ./hadbm-i getmbean HADB:type=InterMA
Attribute           Value
ReleaseInfo         4.5.0.4 (solaris2.10-SUN57_CC-SUN_C)
DomainKey           initialDomainKey
Ready               false
Virgin              true
DefaultDevicePath   /export/home/log/thava/var1/opt/SUNWhadb
DefaultHistoryPath  /export/home/log/thava/var1/opt/SUNWhadb

$ ./hadbm-i getmbean HADB:type=auth
Attribute  Value
AuthInfo   Subject:
             Principal: ConnectionPrincipal(jmxmp://localhost:59337 26474210)
             Principal: ClientAgentProtocolPrincipal(3)
             Principal: ClientBundleProtocolPrincipal(3)
             Principal: ReleasePrincipal(4.5.0.4)
           : [ConnectionPrincipal(jmxmp://localhost:59337 26474210),
              ClientAgentProtocolPrincipal(3), ClientBundleProtocolPrincipal(3),
              ReleasePrincipal(4.5.0.4)]

------
You can set an attribute in an mbean by:
  ./hadbm-i setmbean ConsoleHandlerLevel=INFO[,attr=val*] HADB:type=Logging
------
See /javasrc/mgt/client/com/sun/hadb/cli/help/hadbm-i.txt for more info.

Note: what is the syntax for invoking an mbean operation?
  hadbm-i invokembean <objectname> <operation> [<operand>*]
e.g.
  hadbm-i invokembean HADB:type=InterMA checkFileExists dummyFile
  Operation checkFileExists successfully invoked. Return value is: [void].
or, on failure:
  hadbm-i:Error 22022: The path dummyFile does not exist on host localhost.
(the JMX-level equivalent is sketched below)
---
  hadbm-i getcfgvar [...]
    Get HADB cfg variables for a node in the database.
  hadbm-i setcfgvar [...]
  hadbm-i recoverhost --hosts=<hostlist>
    Recover the hosts in <hostlist> if they have lost their repository.
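For reference, a sketch of what the same invokembean call looks like at the
JMX level, reusing a connection obtained as in the earlier JMXMP sketch. The
java.lang.String parameter signature assumed for checkFileExists is an
illustration, not taken from the HADB sources.

  import javax.management.MBeanServerConnection;
  import javax.management.ObjectName;

  public class InvokeSketch {
      // Mirrors: hadbm-i invokembean HADB:type=InterMA checkFileExists <path>
      static void checkFileExists(MBeanServerConnection mbsc, String path) throws Exception {
          ObjectName interMa = new ObjectName("HADB:type=InterMA");
          // Returns void on success; a failure presumably surfaces as an
          // exception whose message corresponds to the "Error 22022 ..."
          // output shown above.
          mbsc.invoke(interMa,
                      "checkFileExists",
                      new Object[] { path },
                      new String[] { "java.lang.String" });
      }
  }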
----
After database creation, these are the active mbeans:

$ ./hadbm-i browsembeanserver
Class                                   ObjectName
javax.management.MBeanServerDelegate    JMImplementation:type=MBeanServerDelegate
com.sun.hadb.mgt.rema.Node              HADB:type=Node,dbname=hadb,physno=0
com.sun.hadb.mgt.agent.MgtUser          HADB:type=MgtUser,username=admin,id=USR_admin
com.sun.hadb.mgt.rema.OperationMonitor  HADB:type=OperationMonitor,dbname=hadb,desired=2
com.sun.hadb.mgt.rema.DataDevice        HADB:type=DataDevice,dbname=hadb,devno=0,physno=1
com.sun.hadb.mgt.rema.DRU               HADB:type=DRU,dbname=hadb,druno=1
com.sun.hadb.mgt.agent.HostInterface    HADB:type=HostInterface,hostname=localhost,address=10.12.162.132,id=HI_localhost_10_12_162_132
com.sun.hadb.mgt.rema.NomanDevice       HADB:type=NomanDevice,dbname=hadb,physno=1
com.sun.hadb.mgt.agent.HADBManagement   HADB:type=HADBManagement
com.sun.hadb.mgt.rema.Node              HADB:type=Node,dbname=hadb,physno=1
com.sun.hadb.mgt.rema.RelalgDevice      HADB:type=RelalgDevice,dbname=hadb,physno=1
com.sun.hadb.mgt.rema.Database          HADB:type=Database,dbname=hadb
com.sun.hadb.mgt.agent.HostInterface    HADB:type=HostInterface,hostname=localhost,address=127.0.0.1,id=HI_localhost_127_0_0_1
com.sun.hadb.mgt.rema.NiLogDevice       HADB:type=NiLogDevice,dbname=hadb,physno=0
com.sun.hadb.mgt.agent.Platform         HADB:type=Platform
com.sun.hadb.mgt.rema.SoftwareBundle    HADB:type=SoftwareBundle,name=V4.5.0.4
com.sun.hadb.mgt.agent.Logging          HADB:type=Logging
com.sun.hadb.mgt.rema.NomanDevice       HADB:type=NomanDevice,dbname=hadb,physno=0
com.sun.hadb.mgt.rema.NiLogDevice       HADB:type=NiLogDevice,dbname=hadb,physno=1
com.sun.hadb.mgt.rema.DRU               HADB:type=DRU,dbname=hadb,druno=0
com.sun.hadb.mgt.agent.InterMA          HADB:type=InterMA
com.sun.hadb.mgt.agent.Auth             HADB:type=auth
com.sun.hadb.mgt.agent.Readiness        HADB:type=readiness
com.sun.hadb.mgt.agent.Host             HADB:type=Host,hostname=localhost,id=H_localhost
com.sun.hadb.mgt.rema.DataDevice        HADB:type=DataDevice,dbname=hadb,devno=0,physno=0
com.sun.hadb.mgt.rema.RelalgDevice      HADB:type=RelalgDevice,dbname=hadb,physno=0

The main starting point is to do a getmbean on the database object:

$ ./hadbm-i getmbean HADB:type=Database,dbname=hadb
Attribute         Value
Name              hadb
State             FaultTolerant
....
MsgKey            994675845
ClientKey         1481894899
DRU0              HADB:type=DRU,dbname=hadb,druno=0
DRU1              HADB:type=DRU,dbname=hadb,druno=1
Nodes[0]          HADB:type=Node,dbname=hadb,physno=0
Nodes[1]          HADB:type=Node,dbname=hadb,physno=1
TConServerList    127.0.0.1:15000,127.0.0.1:15020
OperationMonitor  HADB:type=OperationMonitor,dbname=hadb,desired=2
SoftwareBundle    HADB:type=SoftwareBundle,name=V4.5.0.4
Portbase          15000
SqlServerList     127.0.0.1:15005,127.0.0.1:15025
.....
---------
$ ./hadbm-i getmbean HADB:type=SoftwareBundle,name=V4.5.0.4
Attribute      Value
Hosts[0]       HADB:type=Host,hostname=localhost,id=H_localhost
Description    Default bundle for managing HADB version 4.5.0.4
AdminProtocol  2
Name           V4.5.0.4
Path           /home/thava/hadb/src/clustra/install/solaris2.10-SUN57_CC-SUN_C
---------
After database creation:
$ ./hadbm-i getmbean HADB:type=InterMA
Attribute           Value
ReleaseInfo         4.5.0.4 (solaris2.10-SUN57_CC-SUN_C)
DomainKey           52dd5373_10bb88062f7_-8000
Ready               true
Virgin              false
DefaultDevicePath   /export/home/log/thava/var1/opt/SUNWhadb
DefaultHistoryPath  /export/home/log/thava/var1/opt/SUNWhadb
RepositoryChecksum  SHA-1:ef17d1f12edb279d9416bc8d260b3bd97bd45425:6
-------

Don't get lost in the details: the MA management system arch has too much
abstraction, and too much abstraction often hurts -- it is very difficult to
optimize the code or find a performance bottleneck. Don't try to understand
everything in the arch document.
Just take a couple of use cases and run through the entire operation.

InterMA JGroups communication categorizes msgs into 3 channels:
  - operation state, db state, repository

MA interacts with the node manager directly to get db status, control, etc.

An unintuitive interface is difficult for others to understand, but it is easy
for the one who wrote it. If the interface is difficult, just follow the use
cases provided by its creator.

Setting each parameter requires a node restart -- which is unnecessary.

Node State and Node Role:
  Node state: starting, repairing, recovering, waiting, running, backingup,
              restoring, stopping, halting, stopped, unknown
  Node role:  active, spare, offline, shutdown, booting??, other, unknown

  Definition: the node role identifies the procedure to be followed when a
  node is started/stopped or becomes unavailable. Booting is actually a state
  though!
  e.g. a spare node can be running/stopped.
  e.g. an active node can be running/stopped/recovering.
  Why is booting a node role? A booting node can be running/stopped???

Db states: HAFaultTolerant (running with spares), FaultTolerant (running, no
spares), Operational (at least 1 out of 2 running), NonOperational, Stopped.

NodeStartLevel:
  undefstart - undefined start level
  auto       - the system determines it
  mem        - start with mem recovery, else go to disk
  disk       - start with disk recovery, else try repair
  repair     - recover from the mirror node
  firststart - first-time start
  nomanstart - start after a controlled stop
  E.g. noman can instruct the node directly to try repairing. The default
  auto may map to mem or disk.

Exceptions thrown across the JMX communication should be serializable so that
the info can be passed from one end to the other. (See the sketch at the end
of this section.)

The repository listener thread invokes the node reconfigurator thread as and
when required. During database reconfiguration, multiple configurations are
stored in the repository, the latest being the "desired configuration". The
obsolete config is deleted only after the reconfig is complete.

MA runs as many node monitoring threads as there are nodes (including remote
ones). A node monitoring thread usually asks the local noman server about the
status of its node (whether that node is local or remote). In special cases it
may ask other noman servers or the local nsup.

The repository supports only one open transaction at a time. It consists of
the following components:
  - DistributedRepository
  - DrTxHandler - transaction handler (message-driven implementation)
  - DrStore     - data store
Transaction handlers on different hosts send msgs among themselves, and to
themselves, to drive the transaction forward.

The domainkey uniquely identifies the domain on the same JGroups port.

New nodeset (2b): [+DICT/2,0] [...] -- this means "UP dictionary service on
node 2 at portoffset 0".
NodesetNumber=2b: the msg will be rejected by this node if the nodeset number
it recognizes is newer or older than this number.

Node start levels: 0 = ask NSUP what to do (auto); 3 = repair.
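A minimal illustration of the serializable-exceptions point above. The class
name and fields are made up for the example; java.lang.Throwable is already
Serializable, so the practical requirement is that any extra state the
exception carries across the JMX connection is serializable too.

  public class NodeOperationException extends Exception {
      private static final long serialVersionUID = 1L;

      // Extra state must itself be serializable (primitives, Strings, ...),
      // otherwise the remote client gets a marshalling error instead of the
      // real failure information.
      private final int nodeNo;
      private final String nodeState;   // e.g. "stopped", "recovering"

      public NodeOperationException(int nodeNo, String nodeState, String message) {
          super(message);
          this.nodeNo = nodeNo;
          this.nodeState = nodeState;
      }

      public int getNodeNo() { return nodeNo; }
      public String getNodeState() { return nodeState; }
  }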
------------------------------------------------------------------
The NMStateBroadcaster class is used by MA to send cluma broadcasts to the
noman servers:
  1) by rema.Database.getState(), which sends getSystemStateC
  2) it sends refreshNomanStateC to force a reread of node role changes; the
     noman then rereads table 105. (before reconfig or after reconfig?)

Note: select * from sysroot.alltables; says table 105 is krnodes.

SQL: select * from krnnodes;
nodeid  logicalid  dunitid  nodename  status
0       0          0        node-0    active
1       1          1        node-1    active
2       2          0        node-2    active
3       3          1        node-3    active
4       4          0        node-4    spare
5       5          1        node-5    spare
------------------------------------------------------------------
Database state JChannel: used to broadcast the local nodes' states.

There are around 48 services available:

illegalService   = new ServiceId(0x00); // for safety: "(void)"
noService        = new ServiceId(0x01); // "(NO SERV)"
nsupService      = new ServiceId(0x02); // "NSUP"
pKernService     = new ServiceId(0x03); // "P-KERN" primary kernel
hsKernService    = new ServiceId(0x04); // "HS-KERN" hot standby kernel
recKernService   = new ServiceId(0x05); // "REC-KERN" recover kernel service;
                                        // main-mem or disk recovery only
birthKernService = new ServiceId(0x06); // "BTH-KERN" node came up by initial
                                        // genesis startup or repair (no recovery)
pTconService     = new ServiceId(0x07); // P-TCON primary tcon
hsTconService    = new ServiceId(0x08); // HS-TCON hot standby tcon
uchnService      = new ServiceId(0x09); // UCHN process can ship logs to/from the primary;
                                        // this service no longer needs to be represented???
dictService      = new ServiceId(0x0a); // DICT system tables with id < 200 are available and
                                        // loaded into shared mem; service provided by TCON.
                                        // Provides global info like nodes, replicas, tables, etc.
birthDictService = new ServiceId(0x0b); // BTH-DIC dict cache is booting; used in genesis start
repmService      = new ServiceId(0x0c); // REPM node able to repair; see ACT-REPM (repair active).
                                        // Service provided by the spare node's TCON or KERN?
haltService      = new ServiceId(0x0d); // "HALT" when the nsup of any node announces this
                                        // service, all nodes will stop
snmpService      = new ServiceId(0x0e); // SNMP clu_snmp_srv, obsolete
snmpTconService  = new ServiceId(0x0f); // SNMP-TC obsolete
snmpKernService  = new ServiceId(0x10); // SNMP-KR obsolete
net0Service      = new ServiceId(0x11); // NET0 network 0 is active; used??? by NSUP
net1Service      = new ServiceId(0x12); // NET1 network 1 is active; by NSUP
net2Service      = new ServiceId(0x13); // NET2 network 2 is active; by NSUP
net3Service      = new ServiceId(0x14); // NET3 network 3 is active; by NSUP
backSlaveService = new ServiceId(0x15); // BCK-SLA backup slave, controlled by BCK-TC;
                                        // service provided by the backup server
backManService   = new ServiceId(0x16); // BCK-TC backup transaction controller;
                                        // service provided by TCON
restSlaveService = new ServiceId(0x17); // REST-SLA restore slave, provided by the backup srv
restManService   = new ServiceId(0x18); // REST-TC restore transaction controller;
                                        // service provided by TCON
actBackSlaveService = new ServiceId(0x19); // ACT-BSLA backup active (backing up now)
actRestSlaveService = new ServiceId(0x1a); // ACT-RSLA restore active (restoring now)
startService     = new ServiceId(0x1b); // START node can answer requests about the startlevel
                                        // if the node has been started by autostart,
                                        // e.g. mem/disk/repair/genesis/noman start.
                                        // The only service provided by more than 1 proc: NSUP & TCON
activeRepmService = new ServiceId(0x1c); // ACT-REPM repair in progress! Either self-repair or
                                         // spare-repair in progress; provided by TCON or KERN?
offlineService   = new ServiceId(0x1d); // OFFLINE node role is offline and nsup is running; by NSUP.
                                        // Participates in the VP protocol; no other service available.
activeBackupService  = new ServiceId(0x1e); // ACT-BACK backup is active, provided by TCON.
                                            // Note: compare with ACT-BSLA, provided by the backup server.
activeRestoreService = new ServiceId(0x1f); // ACT-REST restore active, provided by TCON.
                                            // ACT-RSLA, the active restore slave (by the backup srv),
                                            // does the actual work.
arReceiverService = new ServiceId(0x20); // AR-RCV async replication receiver, obsolete;
                                         // used to be provided by clu_reprcv_serv
arSenderService   = new ServiceId(0x21); // AR-SND async replication sender, obsolete.
statdictService  = new ServiceId(0x22); // STATDICT process is ready to answer requests
                                        // about statistics; by KERN
dummyService     = new ServiceId(0x23); // DUMMY dummy (test) service
sqlcService      = new ServiceId(0x24); // SQLC SQL compiler, ready to receive sql connections;
                                        // provided by clu_sql_srv (now clu_controller)
sqlxService      = new ServiceId(0x25); // SQLX SQL executor; SQL relalg server ready
birthSqlxService = new ServiceId(0x26); // BTH-SQLX relalg server coming up
sqldictService   = new ServiceId(0x27); // SQLDICT SQL dict tables are ready; by TCON.
                                        // Later the SQLSHM service announces that clu_sqlshm_srv has
                                        // loaded this info (from TCON) into shared mem. This info
                                        // contains only SQL-specific tables, columns, views, etc.
                                        // (no info about nodes, fragments, etc.) and is meant to be
                                        // shared between all clu_sql* processes.
sqlshmService    = new ServiceId(0x28); // SQLSHM SQL dictionary loaded in shared mem;
                                        // provided by clu_sqlshm_srv (now clu_controller)
ver1Service      = new ServiceId(0x29); // VER5 - NSUP protocol version 5 service
ver2Service      = new ServiceId(0x2a); // VER6 - NSUP protocol version 6 service
ver3Service      = new ServiceId(0x2b); // VER7 - NSUP protocol version 7 service
ver4Service      = new ServiceId(0x2c); // VER4 - NSUP protocol version 4 service
oldverService    = new ServiceId(0x2d); // OLDVER - NSUP old protocol version service
nomanRunService  = new ServiceId(0x2e); // NOM-R - NOMAN running
nomanStateService = new ServiceId(0x2f); // NOM-S - NOMAN has state. obsolete?

-------------------------------------------------------------------------
private String[] serviceText = new String[] {
    "(void)",   "NO SRV",   "NSUP",     "P-KERN",  "HS-KERN", /* 0x01 .. */
    "REC-KRN",  "BTH-KRN",  "P-TCON",   "HS-TCON",            /* 0x05 .. */
    "UCHN",     "DICT",     "BTH-DIC",  "REPM",               /* 0x09 .. */
    "HALT",     "SNMP",     "SNMP-TC",  "SNMP-KR",            /* 0x0d .. */
    "NET0",     "NET1",     "NET2",     "NET3",               /* 0x11 .. */
    "BCK-SLA",  "BCK-TC",   "REST-SLA", "REST-TC",            /* 0x15 .. */
    "ACT-BSLA", "ACT-RSLA", "START",    "ACT-REPM",           /* 0x19 .. */
    "OFFLINE",  "ACT-BACK", "ACT-REST", "AR-RCV",             /* 0x1d .. */
    "AR-SND",   "STATDICT", "DUMMY",    "SQLC",               /* 0x21 .. */
    "SQLX",     "BTH-SQLX", "SQLDICT",  "SQLSHM",             /* 0x25 .. */
    "VER5",     "VER6",     "VER7",     "VER4",               /* 0x29 .. */
    "OLDVER",   "NOM-R",    "NOM-S"                           /* 0x2d .. */
};

Shared mem segments:
- SQL dictionary cache: info about SQL-only data (tables, columns, views);
  no global data such as fragments, nodes, etc.
    created by: TCON
    loaded by:  sqlshm
    shared by:  the transaction server and all sql servers
    is this persistent?
- Distribution dictionary cache: info about nodes, fragments, tables, columns,
  etc.; includes global info (and also most of the sql info?)
    created by: TCON and/or KERN?
    shared by:  TCON and KERN and *all* clu_* servers!
- Database buffer: used by KERN for data pages on disk.
- Tuple log buffer: log records; created/maintained by the TCON/KERN services.
- Node internal log buffer: logs node-internal B-tree operations; by KERN.

-----------------------------------------------------------------------------
Comsys communication system:
- built on UDP/IP
- used for client-server, server-server, local or cross-node communication
- built on a msg-passing protocol that supports retransmission / double network
- supports 3 connection-oriented communication mechanisms:
    dialog      - msg-based request/reply mechanism
    stream      - streaming large amounts of data (for the relalg server)
    RPC channel - for communication between one client and multiple servers;
                  provides session auth, versioning of server procedures, etc.
---------------------------------------------------------------------------
Thread management and synchronization:
- thread management      - "etx" library
- thread synchronization - "syn" library
- thread msg passing     - "que" library
---------------------------------------------------------------------------
pmanager - process manager:
- pmanager handles communication between NSUP and the other servers.
- the pmanager of NSUP sends periodic keep-alives; the pmanager of each server
  must reply.
- it also handles node-set changes: NSUP sends (new-nodeset, old-nodeset) to
  all local servers, and the server's pmanager calls the process-specific
  ::nodeSetChange() function to take action.
-----------------------------------------------------------------------------
Using cluma/cladm, you can selectively turn debug on/off for server processes.
-----------------------------------------------------------------------------
Dynamic storage:
-----------------------------------------------------------------------------
Nodeset:
- the set of services across all nodes.
- the last nodeset is known/maintained by all NSUPs.
  (note: even noman may not be running on a node with the "offline" role)
- Virtual Partitioning (VP) protocol: a selected coordinator receives the
  service sets from all nodes and then distributes them.
- the NSUP I-am-alive protocol detects NSUP failures and activates the VP
  protocol.
- the PSUP I-am-alive protocol detects failures in child processes; NSUP
  restarts processes and initiates the VP protocol if needed.
-----------------------------------------------------------------------------
Node state: deduced from [node type, node role, services available].
It is not used for anything other than display purposes in hadbm status.
--------------------------------------------------------------------
Do the individual server processes read table 105 to decide their role and
logical number at startup?
--------------------------------------------------------------------
How does an active process abort an ongoing spare-takeover transaction?
--------------------------------------------------------------------
How to identify a spare node? It should have the 'spare' role and the DICT and
REPM services, but not ACT-REPM. Note: a self-repairing node has the same
services. (A sketch of this rule follows at the end of this section.)
--------------------------------------------------------------------
Await-repair state: ScanNodeDictThread keeps scanning table 105 (once every
30 secs). It initiates a spare node repair if one of the following is true:
- the role is ACTIVE and the node has no NSUP (stopped)
- the role is shutdown and the node has the offline service.
--------------------------------------------------------------------
Booting state:
- A genesis start of the whole database has no data dictionary available, so
  it needs utility programs to bootstrap the process.
- For a genesis start of a single node, the data dictionary is available at
  some other node (i.e. non logical number 0 or 1)??  Thus the booting node
  can read the data dictionary in the normal way, by establishing a
  session/transaction and issuing data read operations.
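A hypothetical sketch encoding the spare-node rule above. The class, the
method, and the representation of roles/services as strings are made up for
illustration; only the rule itself (role 'spare', services DICT and REPM
present, ACT-REPM absent) comes from the notes.

  import java.util.Set;

  public class SpareNodeCheck {
      // Returns true if a node looks like an idle spare per the rule above.
      // As noted, a self-repairing node exposes the same services, so this
      // check alone cannot distinguish the two cases.
      static boolean looksLikeIdleSpare(String nodeRole, Set<String> services) {
          return "spare".equals(nodeRole)
                  && services.contains("DICT")
                  && services.contains("REPM")
                  && !services.contains("ACT-REPM");
      }

      public static void main(String[] args) {
          System.out.println(looksLikeIdleSpare("spare",
                  Set.of("NSUP", "DICT", "REPM")));              // true
          System.out.println(looksLikeIdleSpare("spare",
                  Set.of("NSUP", "DICT", "REPM", "ACT-REPM")));  // false: repairing
      }
  }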
--------------------------------------------------------------------
A device config looks like this:

#V=1 N=DataDeviceConfig_-56aedbbd_112f419fd5c_-7f70
##META className=com.sun.hadb.mgt.rema.DataDeviceConfig
##META factoryClassLocation=
##META factoryClassName=com.sun.hadb.mgt.agent.BundleObjectFactory
#
id=DataDeviceConfig_-56aedbbd_112f419fd5c_-7f70
bundle=SoftwareBundle_517998f8_112f41a2aee_-7ffc
bundleName=V4.6.1.1
path=/export/home/log/thava/var1/
nodeconf=NodeConfig_-56aedbbd_112f419fd5c_-7f7a
mindevinitnum=1
blocksize=16384
numblocks=4096
datadevicenumber=0
--------------------------------------------------------------------
How to configure HADB data device sizes and data/log buffer sizes?

From the hadbm help create output:
  minimum datadevicesize = (((4 * LogBufferSize) / NumberOfDataDevices) + 16) MB
This means each data device must be able to hold 4 times the log buffer
(log buffer + internal log buffer??); it won't have any space for data, though.
By default LogBufferSize is 48 MB and there is 1 data device, resulting in a
minimum requirement of 208 MB. The maximum size is 262144 MB.
Note: the data device size does not depend on the data buffer pool size.

1 block = 16 KB; total number of blocks = 4096; device size = 64 MB.

See the HADB server FAQ:
  https://clustra.norway.sun.com/twiki/bin/view/HADB/HadbUserDocumentation
  https://clustra.norway.sun.com/twiki/bin/view/HADB/FAQHADBServers

The --minimumsize option results in the following settings:
  DataBufferPoolSize    = 16777216 (16 MB)  [default = 200 MB]
  InternalLogbufferSize = 4194304  (4 MB)   [default = 12 MB]
  LogbufferSize         = 4194304  (4 MB)   [default = 48 MB]
  RelalgDeviceSize      = 33554432 (32 MB)  [default = 128 MB]
  DataDeviceSize        = 64 MB             [default = 1024 MB]
Per-node memory requirement = 24 MB; disk size requirement = 64 + 32 MB.

Explicitly creating a minimum-size database:
  hadbm create --hosts host1,host2 --devicesize 64 \
    --set "InternalLogbufferSize=4 LogbufferSize=4 RelalgDeviceSize=32 DataBufferPoolSize=16 Node-0.DataDevice-0.DevicePath=/disk1" mydb
-------------------------------------------------------------------------
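The sizing rule above is easy to sanity-check with a few lines of code. This
is just a helper reproducing the formula from hadbm help create; only the
formula and the 48 MB / 208 MB default figures come from the notes, the class
and method names are illustrative.

  public class DeviceSizing {
      // minimum datadevicesize = ((4 * LogBufferSize) / NumberOfDataDevices) + 16   (in MB)
      static int minDataDeviceSizeMb(int logBufferSizeMb, int numberOfDataDevices) {
          return ((4 * logBufferSizeMb) / numberOfDataDevices) + 16;
      }

      public static void main(String[] args) {
          // Defaults: LogBufferSize = 48 MB, one data device -> 208 MB minimum.
          System.out.println(minDataDeviceSizeMb(48, 1));   // prints 208
          // With the --minimumsize LogbufferSize of 4 MB the formula gives 32 MB,
          // comfortably below the 64 MB DataDeviceSize that --minimumsize uses.
          System.out.println(minDataDeviceSizeMb(4, 1));    // prints 32
      }
  }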