Category Archives: Forum Netapp

How to manually send AutoSupport messages to NetApp


 

KB Doc ID 1013073 Version: 15.0 Published date: 07/30/2015 Views: 72184

Description

One of NetApp’s AutoSupport best practices is to enable the storage system to automatically generate and transmit AutoSupport messages to NetApp using the integrated functionality in Data ONTAP. With AutoSupport enabled, NetApp can provide proactive support through My AutoSupport and reactive support through the NetApp Support Center. The My AutoSupport tool allows users to administer and support their NetApp storage systems with tools such as an AutoSupport data viewer, a device visualizer, storage efficiency and health check reports, and a Data ONTAP upgrade plan generator.

 

For a small number of users, it might not be possible for AutoSupport messages to be sent automatically due to network connectivity limitations. However, the AutoSupport data might be required for Technical Support to efficiently troubleshoot a system failure. In these cases, the AutoSupport data will have to be collected manually. In Data ONTAP 7.x and earlier, a majority of the AutoSupport data was formatted as plain text. However, in Data ONTAP 8.0 and later, AutoSupport data is increasingly being generated in the XML format, which requires an XML viewer to read the data. The switch to XML was done to minimize the size of AutoSupport messages. Additionally, in Data ONTAP 8.0.1 and later, the attachments are encoded in a MIME format when stored in the storage system’s /etc/log/autosupport directory.

Some support engineers might find it advantageous for the user to manually send AutoSupport data in a way that allows NetApp systems such as My AutoSupport or the NetApp SmartSolve Tools Portal to display the data. This article contains the steps to provide AutoSupport data as a file upload or in a manner that makes the data visible in My AutoSupport.

For more information on enabling AutoSupport, see the AutoSupport section on the NetApp Support site.
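If the goal is to generate a fresh AutoSupport on demand so that the resulting files land in the /etc/log/autosupport directory for manual collection, a minimal sketch for a 7-Mode system is shown below (the case number is only a placeholder):

filer> options autosupport.doit "Case_2001234567"

The generated files can then be retrieved from /etc/log/autosupport on the root volume (for example, via the etc$ administrative share or an NFS mount of the root volume) and uploaded to NetApp.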


Netboot

How to Network Boot (Netboot) a LOADER / CFE based filer in Data ONTAP 7G

 

KB ID: 1012003 Version: 13.0 Published date: 03/02/2015 Views: 14837

Description

Currently, netbooting a kernel image over the network via the TFTP or HTTP protocol is supported on platforms using LOADER (FAS2000 series, FAS3040/FAS3070, FAS3200, FAS6000, and FAS8000 series) or the Common Firmware Environment (FAS200 series and FAS3020/FAS3050). The steps to configure an existing controller to provide the TFTP and HTTP services for netbooting are described below:

Procedure

For setting up TFTP services for netbooting:

  1. Log in to a controller that is up and running Data ONTAP.
  2. On the controller, type the following:
    filer> options tftpd.enable on
  3. Create the /etc/tftpboot directory if it does not exist. This is the default root directory where the netboot kernel (for example, netapp-mips) should be placed. Use 'options tftpd.rootdir' to check that the TFTP root directory reflects the directory above (see the console sketch after this list).
  4. Place the netboot kernel image into this directory (/etc/tftpboot, or whatever the TFTP root directory is set to).
    Note: Obtain the file for the correct platform from the Data ONTAP software download page (http://mysupport.netapp.com/NOW/cgi-bin/software).
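Putting steps 2 and 3 together, the console session might look like the sketch below (the root directory value shown is the default, not a requirement):

    filer> options tftpd.enable on
    filer> options tftpd.rootdir
    tftpd.rootdir                /etc/tftpboot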

Setting up HTTP services for netbooting:

  1. Log in to a controller that is running Data ONTAP.
  2. Place the netboot kernel, for example netapp-mips, into the /etc/http directory. By default, this directory is used to serve HTTP requests for management (ZAPI) access to the controller.

    For TFTP netbooting using the default rootdir:
    LOADER> netboot tftp://Filer/netapp-mips

    For HTTP netbooting using the /etc/http directory, type:
    LOADER> netboot http://partner_ip/na_admin/722_netboot.e

LOADER based filer
FAS2020 / FAS2050 / FAS3040 / FAS3070 / FAS3140 / FAS3170 / FAS3200 / FAS6030 / FAS6040 / FAS6070 / FAS6080 / FAS8000

To netboot a LOADER-based filer, substitute your IP addresses and perform the following at the LOADER> prompt:

LOADER> ifconfig e0a -addr=192.168.1.10 -mask=255.255.255.0 -gw=192.168.1.1 -dns=192.168.1.2
e0a: Link speed: 1000BaseT FDX
Device e0a:  hwaddr 00-A0-98-03-48-AB, ipaddr 192.168.1.10, mask 255.255.255.0
gateway 192.168.1.1, nameserver not set
If a gateway is available, test connectivity:
LOADER> ping 192.168.1.1
192.168.1.1 (192.168.1.1) is alive
192.168.1.1 (192.168.1.1): 1 packets sent, 1 received
Execute the netboot command:
LOADER> netboot tftp://192.168.1.11/tftproot/netapp_7.2.3_x86_64
Loading:...........

Type the following at the LOADER> prompt to display the complete syntax: help ifconfig

Notes:

  • Only built-in Ethernet ports are supported for netbooting.
  • This procedure is not persistent across reboots.

Common Firmware Environment (CFE) based filer
FAS250 / FAS270 / FAS3020 / FAS3050

To netboot a FAS250, see the section entitled ‘Netbooting an FAS250’ found in the FAS250 Storage Appliance Hardware and Service Guide.

CFE supports booting from a CompactFlash device or from the network using the TFTP or HTTP protocol. A brand-new CompactFlash card contains no Data ONTAP image, so the system must boot from the network (netboot). The netboot network configuration does not persist across reboots.

Important Note:
It might take some time before the network connection is established. Use the ifconfig command to check the link status. Ping the TFTP or HTTP server before starting the netboot, because a failed kernel download takes a long time to abort.

Provide all the network parameters (that is, gateway and DNS) even if they are not needed; this helps establish the connection faster.
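As a rough sketch, and assuming the CFE prompt accepts the same ifconfig and netboot syntax shown above for LOADER-based filers, a CFE netboot over HTTP might look like this (addresses and the kernel file name are placeholders):

CFE> ifconfig e0a -addr=192.168.1.10 -mask=255.255.255.0 -gw=192.168.1.1 -dns=192.168.1.2
CFE> ping 192.168.1.11
CFE> netboot http://192.168.1.11/na_admin/netapp-mips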


Note: Use the netboot file when downloading and installing Data ONTAP.

 

Disclaimer

NetApp provides no representations or warranties regarding the accuracy, reliability, or serviceability of any information or recommendations provided in this publication, or with respect to any results that may be obtained by the use of the information or observance of any recommendations provided herein. The information in this document is distributed AS IS, and the use of this information or the implementation of any recommendations or techniques herein is a customer’s responsibility and depends on the customer’s ability to evaluate and integrate them into the customer’s operational environment. This document and the information contained herein may be used solely in connection with the NetApp products discussed in this document.

Cluster mode to 7 mode

Maybe this will help.

The controller is currently set to boot Cluster-Mode. You need to change this at the LOADER prompt.

Procedure:

– Halt the controller.

– At the boot loader, set the environment variable bootarg.init.boot_clustered to false:

LOADER> setenv bootarg.init.boot_clustered false

– Install Data ONTAP 7-Mode.

– Boot into the boot menu and select option 4 for the initial installation (a console sketch follows).
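A minimal sketch of the LOADER session (the printenv check is optional and only confirms that the variable was set):

LOADER> setenv bootarg.init.boot_clustered false
LOADER> printenv bootarg.init.boot_clustered
LOADER> boot_ontap

When prompted during boot, press Ctrl-C to enter the boot menu and select option 4 to initialize the disks for the initial 7-Mode installation.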

How to move an aggregate between software disk-owned HA pairs.


 

KB ID: 1011651 Version: 7.0 Published date: 08/07/2014 Views: 3206

Description

This article explains how to move an aggregate from one controller in an HA pair to its partner on a system that uses software disk ownership. This procedure only applies to 7-Mode systems.

Procedure

For reference: FILER1 owns the disks initially; FILER2 is the controller the disks/volumes are being moved to.

WARNING:
Before starting this procedure, confirm that there are no aggregates or FlexVol volumes on FILER2 that have the same name as the aggregate, FlexVol volumes, or traditional volume being moved. Otherwise, the relocated volume(s) with conflicting names will be renamed with "(1)" appended instead of keeping the original name.

Note:
This procedure is only supported under the following conditions:

  • The disks are not physically moved; ownership remains with one of the two nodes in the HA pair.
  • If disks are moved outside the HA pair, the shelf MUST NOT be moved.

Do not mistake this for a procedure to move disks or a shelf to another filer. If the shelf needs to be moved outside of the HA pair, downtime is required.

    1. Start by taking the aggregate offline from FILER1.

Example:
For a traditional volume:

FILER1> aggr offline <volname>

      For an aggregate with FlexVols:

FILER1> priv set diag
FILER1*> aggr offline <volname> -a
FILER1*> priv set admin
FILER1>

    2. Move disk ownership of all disks in the aggr to FILER2.

WARNING:
The node on which the disk assign is done must be the node that is giving away the aggregate.

   FILER1> disk assign 0a.16 0a.17 0a.18 0a.19 -o FILER2 -f

(-f must be used since the disks are already owned)

    3. Verify that no further disks belonging to the original aggregate are left on the original node.

FILER1> aggr status -r <original-volname>

    4. Online the relocated aggregate from FILER2:

FILER2> aggr online <volname>

Note:
It will be necessary to reconfigure any Common Internet File System (CIFS) shares, Network File System (NFS) exports, or the appropriate igroups on the partner for the relocated volume(s) before clients can access this data.
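For example, re-creating an NFS export and a CIFS share for a relocated volume on FILER2 might look like the sketch below (the volume name, share name, and export options are placeholders, not values taken from this procedure):

FILER2> exportfs -p rw=192.168.1.0/24 /vol/vol_moved
FILER2> cifs shares -add vol_moved /vol/vol_moved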


 


Configuring CNA ports and FC ports

Configuring CNA ports

If a node has onboard CNA ports or a CNA card, you must check the configuration of the ports and possibly reconfigure them, depending on how you want to use the upgraded system.

Before you begin

You must have the correct SFP+ modules for the CNA ports.

About this task

CNA ports can be configured into native Fibre Channel (FC) mode or CNA mode. FC mode supports FC initiator and FC target; CNA mode allows concurrent NIC and FCoE traffic over the same 10GbE SFP+ interface and supports FC target.
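As a rough sketch only (the nodeshell ucadmin command and adapter 0e are assumptions for illustration, not taken from this text), checking and changing the port personality might look like the following; a reboot of the node is then required for the new personality to take effect:

node> ucadmin show
node> ucadmin modify -m cna -t target 0e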

NetApp – Fix the “Bad Label” issue


Recently I came across a "Bad Label" error while replacing a failed disk. My company changed the support vendor for one of our systems, and the new vendor sent disks to replace the failed ones. Normally the DC tech makes the swap and assigns the disks to the system, but this time he called me about an issue (from /etc/messages):

Thu May 22 13:02:54 CEST [NETAPP: raid.config.disk.bad.label:error]: Disk 9.10 Shelf 6 Bay 9 [NETAPP X291_S15KXXX0F15 NA01] S/N [3QQ312Y2XXXPBW] has bad label.
Thu May 22 13:02:54 CEST [NETAPP: raid.config.disk.bad.label:error]: Disk 6.70 Shelf 4 Bay 7 [NETAPP X291_S15KXXX0F15 NA01] S/N [3QQ3097KXXX5VU] has bad label.

 

To fix the issue I did:

NETAPP> priv set advanced
Warning: These advanced commands are potentially dangerous; use
them only when directed to do so by NetApp
personnel.
NETAPP*> vol status -f

Broken disks

RAID Disk Device HA SHELF BAY CHAN Pool Type RPM Used (MB/blks) Phys (MB/blks)
--------- ------ ------------- ---- ---- ---- ----- -------------- --------------
bad label 6.70 0d 4 7 FC:A 1 FCAL 15000 418000/856064000 420156/860480768
bad label 9.10 0d 6 9 FC:A 1 FCAL 15000 418000/856064000 420156/860480768
NETAPP*> disk unfail -s 6.70
disk unfail: unfailing disk 6.70...
NETAPP*> Fri May 23 08:42:47 CEST [NETAPP: raid.disk.unfail.done:info]: Disk 6.70 Shelf 4 Bay 7 [NETAPP X291_S15XXX0F15 NA01] S/N [3QQ3097KXXX5VU] unfailed, and is now a spare

NETAPP*> disk unfail -s 9.10
disk unfail: unfailing disk 9.10...
NETAPP*> Fri May 23 08:43:04 CEST [NETAPP: raid.disk.unfail.done:info]: Disk 9.10 Shelf 6 Bay 9 [NETAPP X291_S15XXX0F15 NA01] S/N [3QQ312Y2XXXPBW] unfailed, and is now a spare

NETAPP*> vol status -f

Broken disks (empty)
NETAPP*> vol status -s

Pool1 spare disks

RAID Disk Device HA SHELF BAY CHAN Pool Type RPM Used (MB/blks) Phys (MB/blks)
--------- ------ ------------- ---- ---- ---- ----- -------------- --------------
Spare disks for block or zoned checksum traditional volumes or aggregates
spare 6.70 0d 4 7 FC:A 1 FCAL 15000 418000/856064000 420156/860480768 (not zeroed)
spare 9.10 0d 6 9 FC:A 1 FCAL 15000 418000/856064000 420156/860480768 (not zeroed)
NETAPP*> priv set
NETAPP> disk zero spares
NETAPP>
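After starting disk zero spares, the zeroing progress can be checked by re-running vol status -s; spares that are still being zeroed are flagged with their progress. The abbreviated output below is an illustrative sketch, not captured from the system above:

NETAPP> vol status -s
spare 6.70 0d 4 7 FC:A 1 FCAL 15000 418000/856064000 420156/860480768 (zeroing, 12% done)
spare 9.10 0d 6 9 FC:A 1 FCAL 15000 418000/856064000 420156/860480768 (zeroing, 12% done)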

Handling Watchdog Resets


KB ID: 3013539 Version: 8.0 Published date: 01/15/2015 Views: 8592

 

Answer

  1. What is a watchdog reset?

A watchdog is an independent timer that monitors the progress of the main controller running Data ONTAP. Its function is to serve as an automatic server restart in the event the system encounters an unrecoverable system error.

The watchdog implemented by NetApp uses a two-level timer, with a different action associated with each level.

  • Level 1: Timeout: The storage appliance attempts to panic and dump core in response to a non-maskable interrupt. Once an L1 watchdog is successfully issued, the system returns to service and a core file is written, allowing NetApp to determine the root cause of the hang. An L1 watchdog is issued if the timer is not reset within 1.5 seconds.
  • Level 2: Reset: The storage appliance resets through a hard reset signal sent from the timer. An L2 watchdog is issued if the watchdog timer is not reset within two seconds after the L1 watchdog.

It is not necessary to ‘recover’ from a watchdog timeout or watchdog reset, as both of these events are recovery mechanisms for other failures. The objective instead is to identify the failure(s) that caused the watchdog event.

  2. What is the appropriate response to a watchdog timeout (L1 Watchdog Event)?
    A watchdog timeout should be treated just like any other system panic. The associated backtrace and/or the core should be analyzed for the possible root cause(s). A giveback should be performed if necessary.
  3. What is the appropriate response to a watchdog reset (L2 Watchdog Event)?

If the storage appliance receives a single watchdog reset, in general no action needs to be taken, as the condition causing the reset is most often transient and is cleared by the reset process. A giveback should be performed if necessary, and the appliance should be monitored for repeat occurrences.
If a storage appliance takes multiple watchdog resets, look for previously logged errors associated with the CPU, motherboard, memory, or I/O cards.

  4. Data to be collected to help diagnose the cause of a watchdog reset (see the console sketch after this list):
  • AutoSupport messages
  • Console logs before, during, and after the watchdog event (if possible)
  • ssram log (/etc/log/ssram/ssram.log or /mroot/etc/log/ssram/ssram.log) – FAS62xx only
  • On systems with a Service Processor:
    – system sensors
    – events all
    – system log
    – sp status -d
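The Service Processor items listed above can be gathered directly from the SP command line; a short sketch, assuming console or SSH access to the SP, is:

SP> system sensors
SP> events all
SP> system log
SP> sp status -d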

Note: No hardware should be replaced unless the root cause is a hardware issue.

 
