SEMiSLUG Notes
14 May 1998
Question & Answer Sessions
- Wireless Palm Pilots, wireless Nokias, etc. Has anyone seen one? An
integrated thing? Digital? That works in the US? With service?
- Nope.
- Can you do MPP over SparcStation under SunOS?
- No one really knows.
- ISDN Dial-in on DEC Alpha, NT on the other side. Has anyone done this?
- This answer intentionally left blank.
- Sun ATM to unknown Cisco switch?
- Sun's ATM is less than stellar. Another OS might be a good idea.
- SCO Unix /tmp fills up, apparently spontaneously, the system reboots and comes
up in single user mode. Any idea what's up?
- Educated guess: cron job that rebuilds the locate database? It's
common for SCO to croak when / fills up. If set to reboot on a panic,
that would explain the situation. Make sure /tmp is on another
partition.
- Where is NT wasting all that time and resources? What is it doing?
- MSMFC4.DLL, perhaps? What do you expect from an OS that pages its disk cache?
Check out http://www.sysinternals.com/ for some possible tools.
NT fragments the drive horribly during install. The first thing you
should do is defrag.
"Defrag early and often."
- Favorite Unix Palm Pilot interfaces?
- There are a couple of Unix-based hot-sync utilities. (Ask
Steve Arlow <yorick@yorick.com> for the URL.)
- PD driver for Artacom for running with Suns?
- Artacom wants $800 for the driver. That's about it.
- Any experience with multiresident AFS?
- Nope.
- What's the scoop with content.net? fissiontech.com?
- No one's heard of them.
- What's the deal with Chad's screwed up Netscape news server? Should
he upgrade it? Should he worry about the logs saying he's not accepting
input messages?
- Sounds like more of a personal decision. It could probably stand
to be moved to a new box.
- Can Ford live without news while Chad's doing the upgrade?
- Most likely not.
- Where's MJO?
- Beats me.
- What does Gabe do with all the notes he types up?
- Eventually, we'll get the SEMiSLUG web site up-to-date.
Presentation
How to manage a third of a million modems
ANS is now "Worldcom UUNet Services".
330,000 modems in service. Would like to add tens of thousands of
modems a month. They are the 2nd-level support group for the Big
Dial Operations (BD-Ops).
On-Call Operation:
Staff the NOC (9 a.m. to midnight)
One person on call 7x24x365
On-Site Operations
Schedules vary by individual
Staff on-site a a.m. to 10 p.m.
Real-Time monitoring
SNMP and other alerts -- If any of the displays go from green to
red, someone jumps on it. Usually, problems are fixed before
anyone can generate a trouble ticket.
Problem reports from customers.
Escalation from BD-Ops.
Retrospective Analysis and troubleshooting
ANS reports
Client (AOL) reports - hang-up problems and the like. It's bad to
have a modem tied up with no one using it. ANS keeps very accurate
stats on what goes wrong with modems.
For example, 3 of 4 sites in NYC had shorter-than-average connection times.
After months of analysis, they found that each site was connected
through a different phone company and were able to use that
info to get the 3 bad ones to clean up their lines.
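The kind of comparison described above can be sketched roughly as follows. This is only an illustration, not ANS's actual reporting code: the record shape (site, duration in seconds), the site names, the 0.9 tolerance, and the sample numbers are all made up.

```python
from statistics import mean

def flag_short_sites(calls, tolerance=0.9):
    """Return sites whose mean call duration is below tolerance * overall mean.

    calls: iterable of (site, duration_seconds) pairs (hypothetical record shape).
    """
    by_site = {}
    for site, duration in calls:
        by_site.setdefault(site, []).append(duration)
    # Overall mean across every call, regardless of site.
    overall = mean(d for durations in by_site.values() for d in durations)
    return sorted(
        site for site, durations in by_site.items()
        if mean(durations) < tolerance * overall
    )

# Made-up sample data: four sites, three of them with shorter calls.
sample = [
    ("nyc-a", 1800), ("nyc-a", 2000),
    ("nyc-b", 600), ("nyc-b", 700),
    ("nyc-c", 500), ("nyc-c", 650),
    ("nyc-d", 550), ("nyc-d", 600),
]
print(flag_short_sites(sample))
```

A report like this only points at the suspect sites; finding the cause (here, the phone companies) still took the months of analysis mentioned above.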
Real-Time Monitoring
Big Dial Alert Monitor
Equipment configuration
Equipment resets
Abnormal call processing
Pager Notifications
BD-Ops Escalations
Auto-busy On Troubled Hardware
Auto-detect and respond to ring-no-answer
Most problems are handled by the NOC operator or the On-call
body.
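A minimal sketch of the auto-busy idea, assuming a modem is busied out after a run of consecutive ring-no-answer events. The threshold of 3, the modem ID format, and the class itself are hypothetical; the talk did not say how the real monitor decides.

```python
from collections import defaultdict

class RingNoAnswerTracker:
    """Counts consecutive ring-no-answer events per modem (hypothetical IDs)."""

    def __init__(self, threshold=3):
        self.threshold = threshold      # assumed value, not from the talk
        self.misses = defaultdict(int)  # modem_id -> current miss streak
        self.busied_out = set()

    def record(self, modem_id, answered):
        """Record one inbound call; return True if the modem was just busied out."""
        if answered:
            self.misses[modem_id] = 0   # a good call clears the streak
            return False
        self.misses[modem_id] += 1
        if self.misses[modem_id] >= self.threshold and modem_id not in self.busied_out:
            self.busied_out.add(modem_id)
            return True
        return False

tracker = RingNoAnswerTracker(threshold=3)
for answered in (False, False, False):
    just_busied = tracker.record("hub1/slot4/ch7", answered)
print(just_busied, sorted(tracker.busied_out))
```

Counting consecutive misses rather than total misses keeps one bad call from taking a healthy modem out of service.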
Diagnose, Attempt solution, Observe results, Iterate as needed, Escalate as
needed. And FEEDBACK!
Diagnose
Ad-hoc reports to provide general state
Analyze statistics for various suspect equipment
Experiment as appropriate
Isolate problem with PSTN (public switched telephone network) or equipment
Hardware / configuration swaps
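The diagnose / attempt / observe / iterate / escalate cycle can be written down as a small loop. This is just a sketch of the process as described: the three hooks and the attempt limit are hypothetical, and the returned log stands in for the feedback step.

```python
def troubleshoot(diagnose, attempt_fix, is_healthy, max_attempts=3):
    """Run diagnose -> attempt -> observe, iterating up to max_attempts.

    All three callables are hypothetical hooks supplied by the caller.
    Returns ("fixed", log) or ("escalate", log); the log is the feedback.
    """
    log = []
    for attempt in range(1, max_attempts + 1):
        finding = diagnose()
        attempt_fix(finding)
        healthy = is_healthy()            # observe the result
        log.append((attempt, finding, healthy))
        if healthy:
            return "fixed", log
    return "escalate", log                # out of attempts: hand it up

# Toy example: the "fix" is a reset, and the second reset clears the fault.
state = {"resets": 0}
result, log = troubleshoot(
    diagnose=lambda: "card hung",
    attempt_fix=lambda finding: state.update(resets=state["resets"] + 1),
    is_healthy=lambda: state["resets"] >= 2,
)
print(result, len(log))
```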
Proactive Feedback
Feedback to reporting staff
On usefulness of reports
Requests for new / modified reports
Feedback to tools staff
Modifications of existing tools
New proposed tools
Feedback to/from BD-Ops
New procedures for operators
A Big Dial Hub
A USR/3Com Control Hub is a chassis with slots for 17 cards plus 2 power
supply units (PSUs)
Each slot actually consists of 2 slots
Front slot which accepts a Network Access Card (NAC)
Rear slot which accepts a Network Interface Card (NIC)
HDM (High Density Modem)
24 modems on a single card slot
Has a NAC and a NIC
NIC is an interface to a single T1
Front of card has 10 utilization lights instead of a light for each modem
Connects to packet bus, but not the TDM bus
Netserver ports go from S5-S100 (96 ports)
Network Management Card (NMC)
Slot 17 of the chassis
SNMP proxy agent for all the other cards
Loss of NMC does not prevent call processing but does prevent management
of the hub.
Four hubs go in a rack, with an ethernet hub and a computer (manager
workstation). (Used to use RS/6000, but other machines are being used
now.) Up to four racks at a given installation.
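For a rough feel of the numbers, here is a back-of-the-envelope capacity calculation. The four-hubs-per-rack and four-racks-per-site limits come from the talk; the modems-per-hub figure is an assumption (96 is borrowed from the Netserver port count above purely for illustration, and the real figure depends on how each chassis is populated).

```python
HUBS_PER_RACK = 4        # from the talk
MAX_RACKS_PER_SITE = 4   # from the talk

def site_capacity(modems_per_hub, racks=MAX_RACKS_PER_SITE):
    """Modems at one installation; modems_per_hub is an assumption."""
    if not 1 <= racks <= MAX_RACKS_PER_SITE:
        raise ValueError("racks must be between 1 and 4")
    return modems_per_hub * HUBS_PER_RACK * racks

def hubs_needed(total_modems, modems_per_hub):
    """Ceiling division: hubs required to host total_modems."""
    return -(-total_modems // modems_per_hub)

# 96 modems per hub is illustrative only, not a figure from the talk.
print(site_capacity(96))
print(hubs_needed(330_000, 96))
```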
The rack's configuration will be stored in a central database. The
information will come directly from interrogating the rack.
The Manager Workstation
General purpose Unix workstation
Functions of the manager
Collects and stores all information about incoming calls
Runs programs that monitor for potential problems and alerts us to
them
Provides a platform to run troubleshooting tools from ???
Provides out-of-band (OOB) access to most of the cards on the hub
Authenticates users
The NMC sends SNMP traps to the local manager in response to specific
events on the hub, such as a call starting or ending
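One way the manager could turn call-start and call-end notifications into stored call records is sketched below. It does not parse real SNMP traps: the event dictionaries, their keys, and the hub and port names are hypothetical stand-ins for whatever the NMC actually sends.

```python
class CallLog:
    """Pairs call-start and call-end events into completed call records."""

    def __init__(self):
        self.active = {}      # (hub, port) -> start timestamp
        self.completed = []   # (hub, port, start, end, duration)

    def handle(self, event):
        """event: dict with hypothetical keys 'type', 'hub', 'port', 'time'."""
        key = (event["hub"], event["port"])
        if event["type"] == "call_start":
            self.active[key] = event["time"]
        elif event["type"] == "call_end":
            start = self.active.pop(key, None)
            if start is not None:   # ignore an end with no matching start
                end = event["time"]
                self.completed.append((*key, start, end, end - start))

call_log = CallLog()
call_log.handle({"type": "call_start", "hub": "hub1", "port": "S5", "time": 100})
call_log.handle({"type": "call_end", "hub": "hub1", "port": "S5", "time": 460})
print(call_log.completed)
```

Records like these are the raw material for the connection-time reports discussed earlier in the talk.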
So how do you make this work politically?
Collect and analyze data for quality control, problem resolution, etc.
Make sure that the items you're being measured on get top priority.
Obligatory Plug: They're having trouble staffing. If you're a Unix
admin, you're probably overqualified. What they're looking for is
someone with a four-year degree who is used to working with the public,
is having problems getting a job in their field, and has some savvy
with the Net. Someone that can grow into the job. They train for
3 weeks, then 12 weeks of on-the-job training. Call SCS.
Note: This isn't for work in Steve's area, but he has an interest
in getting those jobs filled. Some of his work is being held up
because it depends on work from other groups.
(Much applause from the gallery.)
Rumor & Innuendo