

Office of the Secretary of Defense National Aeronautics and Space Administration





## Long Term Data Retention of Flash Cells Used in Critical Applications





Keith Bergevin (DMEA) Rich Katz (NASA) David Flowers (DMEA)





July 8, 2015







richard.b.katz@nasa.gov keith.bergevin@dmea.osd.mil david.flowers@dmea.osd.mil



#### Collaborative NASA / DMEA Flash Memory Analysis



- NASA
  - NASA designs and develops systems for missions subjected to extremely rugged environments and very long operational time frames.
  - An extraordinary premium is placed on reliability, as electronic components are used in mission-critical and safety-critical applications. Failures can result in loss of many years of engineering effort, unique scientific data, expensive spacecraft, and life.
  - NASA has been and is a leader in maintaining the expertise and equipment for testing and evaluating electronic devices.
- DMEA
  - Operates the only Department of Defense integrated circuit foundry
  - Designs, tests, and fabricates integrated circuits to meet stringent military specifications
  - DMEA is the leading DoD organization for understanding integrated circuit fabrication techniques
- NASA / DMEA
  - This collaboration applies the respective advancements and expertise of each organization to perform an analysis of Flash memory devices that are key components for space and defense
  - The end result is an in-depth study of Flash memory devices and their suitability in high-reliability and/or safety-critical applications











Electrons can tunnel at low bias if Traps line up at a spacing of 3 nm or less



Chart courtesy of Microsemi Corp.



# Hybrid Flash Test Techniques: Microscopy





SEM Image: 10 µm Probe Pads. FIB Isolated Flash Bit-Cell. TEM Sample (Red) welded to Microprobe; sample is 20 nm thick.

SEM: Scanning Electron Microscope TEM: Transmission Electron Microscope FIB: Focused Ion Beam











#### Microsemi methods of reducing Stress-Induced Leakage Current (SILC):

- Microsemi thickens the Tunnel Oxide to 10.3 nm versus 8.5 nm of typical memories to minimize the probability of enough traps lining up to cause leakage.
- Ramp rates and voltages are limited to minimize stress at the expense of write time.
- Write cycles are limited to 500 rather than the 10,000 typical of memories.
- Cell is designed to eliminate high stress locations so electrons are passed uniformly across the Tunnel Oxide.
- Manufacturing tests use both checkerboard and checkerboard bar patterns in bake retention tests to eliminate weak tunnel oxides prior to shipment.
- A secondary objective of the NASA/DMEA test effort is validation of the effectiveness of manufacturer's mitigation techniques
  - Data retention lifetimes of Flash devices utilized in critical applications must be verified
  - NASA/DoD should not rely solely on the manufacturer's claims: independent validation is crucial!

\*courtesy of Microsemi Corp.
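The benefit of the thicker tunnel oxide can be illustrated with simple geometry. The sketch below assumes that the 3 nm limit quoted earlier in this briefing applies to every hop in a trap chain, including the hops from each electrode to the nearest trap; that reading, and the function name `min_traps_to_bridge`, are assumptions made for illustration and are not Microsemi's model.

```python
import math

def min_traps_to_bridge(oxide_nm: float, max_hop_nm: float = 3.0) -> int:
    """Fewest aligned traps that can span the oxide if no hop exceeds max_hop_nm.

    n traps in a line split the oxide into n + 1 hops, so a leakage path
    needs (n + 1) * max_hop_nm >= oxide_nm.
    """
    return max(0, math.ceil(oxide_nm / max_hop_nm - 1))

# Oxide thicknesses quoted in the bullets above
for label, thickness_nm in (("typical memory", 8.5), ("Microsemi", 10.3)):
    traps = min_traps_to_bridge(thickness_nm)
    print(f"{label}: {thickness_nm} nm oxide needs at least {traps} aligned traps")
```

Under this assumption the 10.3 nm oxide needs one more aligned trap than the 8.5 nm oxide, which is the sense in which a thicker oxide lowers the probability of a complete leakage path.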



### Test Station for A3P250L FPGA



















A3P250L FPGA Average Erased V<sub>TH</sub>



# Erased Cell Data Retention at 150 °C Performance vs. Specification



| T<sub>J</sub> (°C) | Spec Life (years) | 6,000 Hour Data Life (years) |
|--------------------|-------------------|------------------------------|
| 70                 | 102.7             | 306.8                        |
| 85                 | 43.8              | 131.1                        |
| 100                | 20.0              | 60.0                         |
| 105                | 15.6              | 46.9                         |
| 110                | 12.3              | 36.9                         |
| 115                | 9.7               | 29.2                         |
| 120                | 7.7               | 23.2                         |
| 125                | 6.2               | 18.6                         |
| 130                | 5.0               | 15.0                         |
| 135                | 4.0               | 12.1                         |
| 140                | 3.3               | 9.8                          |
| 145                | 2.7               | 8.0                          |
| 150                | 2.2               | 6.6                          |

| S/N            | Rate (mV/Year) | # Years | Delta (V) |
|----------------|----------------|---------|-----------|
| CK002          | 51.7           | 2.2     | 0.114     |
| CK003          | 62.5           | 2.2     | 0.138     |
| RK002          | 79.2           | 2.2     | 0.174     |
| RK003          | 56.6           | 2.2     | 0.125     |
| <b>CT</b> 1000 |                |         | 0.011     |
| CK002          |                | 6.6     | 0.341     |
| CK003          |                | 6.6     | 0.413     |
| RK002          |                | 6.6     | 0.525     |
| RK003          |                | 6.6     | 0.374     |

Specification Data From RT ProASIC Data Sheet, Revision 5, September 2012. 6,000 Hour Data derived predictions courtesy of Microsemi Corporation.
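As a cross-check on the specification column, the sketch below estimates the activation energy implied by the 70 °C and 150 °C entries, assuming the specified life follows a single-activation-energy Arrhenius law (life proportional to exp(E<sub>A</sub>/kT)). That model, and the helper names, are assumptions for illustration; the data sheet's actual derivation is not given here.

```python
import math

K_B_EV = 8.617333e-5  # Boltzmann constant in eV/K

def implied_activation_energy(t1_c, life1, t2_c, life2):
    """Activation energy (eV) implied by two (temperature, lifetime) points,
    assuming life is proportional to exp(Ea / (k * T))."""
    t1_k, t2_k = t1_c + 273.15, t2_c + 273.15
    return K_B_EV * math.log(life1 / life2) / (1.0 / t1_k - 1.0 / t2_k)

def scaled_life(life_ref, t_ref_c, t_c, ea_ev):
    """Scale a reference lifetime at t_ref_c (deg C) to temperature t_c (deg C)."""
    t_ref_k, t_k = t_ref_c + 273.15, t_c + 273.15
    return life_ref * math.exp(ea_ev / K_B_EV * (1.0 / t_k - 1.0 / t_ref_k))

# Spec life endpoints from the table: 102.7 years at 70 C, 2.2 years at 150 C
ea = implied_activation_energy(70, 102.7, 150, 2.2)
print(f"implied Ea = {ea:.2f} eV")

# Re-derive intermediate rows from the 150 C entry for comparison with the table
for tj in (85, 100, 125):
    print(f"Tj = {tj} C: {scaled_life(2.2, 150, tj, ea):.1f} years")
```

The two endpoints imply an activation energy of roughly 0.6 eV, and scaling the 150 °C entry with that value lands close to the intermediate rows, which suggests the specification column was generated from a single Arrhenius fit.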





# A3P250L FPGAs: Average Erased $V_{\rm TH}$ Distribution, 242 Devices: Environmentally Stressed and 4 Years of Storage



HAST: Highly Accelerated Stress Test





- Main focus: find and characterize the "weak" bit-cells!
  - Process variation creates array of bit-cells with variable "robustness"
  - Cells can be characterized as strong, nominal or weak
  - Weak cells define the actual data retention lifetime of the entire Flash array
- Hybrid testing: A multifaceted approach to characterize data retention
  - Microscopy and physical de-processing
  - Electrical characterization
  - Modeling and Simulation
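The weakest-link argument above can be written compactly. This is a sketch under simplifying assumptions that are not from the briefing: cells are treated as independent, each with the same probability $p$ of being weak, and $t_i$ denotes the retention lifetime of cell $i$ in an array of $N$ cells.

$$t_{\text{array}} = \min_{1 \le i \le N} t_i, \qquad P(\text{at least one weak cell}) = 1 - (1 - p)^N$$

Even a very small $p$ gives a probability near 1 once $N$ reaches the millions, which is why characterizing nominal cells alone says little about the array.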

Atomic Force Intensity Map of Flash Bit Cell Array



- Advantages of the Hybrid Approach:
  - Tied directly to the physical bit cells, not a statistical construct!
  - Accounts for physical phenomena that cannot be reliably measured (SILC, tunneling currents)
  - Rapid results: Not dependent on years of lifetime testing
    - Once trends from stress test data are identified and characterized, move to virtual environment
    - Simulations can be used to quickly create statistical database
  - Approach has been validated on several high-profile DoD weapons applications



# Hybrid Flash Test Techniques: Modeling and Simulation



- Why use simulation methods?
  - Can generate a large amount of virtual statistical data in a very short time
    - 10 year data retention lifetime can be accurately simulated in less than 24 hours!
  - Allows much more visibility into degradation mechanisms
    - Cannot easily measure quantities such as tunneling currents, SILC or trap distributions





- Types of simulations used by DMEA:
  - First-Principle physics using advanced numerical techniques to solve relevant partial differential equations that govern the phenomena-of-interest
  - Proprietary models allow the inclusion of many different physical phenomena
    - Impact ionization, Fowler-Nordheim tunneling, quantum effects, and many others.
  - Far more detail than typical engineering SPICE simulations
  - Multi-scale: simulations performed at carrier and device level
  - Data from simulations form the basis of predictive model



# Flash Data Retention Tests: Accelerated Lifetime



- Accelerated lifetime tests:
  - Apply elevated temperatures (or voltages) to stress the device
  - Record time-to-failure and/or parametric shifts (e.g., change in threshold voltage,  $V_{TH}$ )
  - Extrapolate stressed lifetime back to operational environment using Arrhenius Equation
  - Favored by manufacturers due to ease of implementation
- Problems with accelerated lifetime testing:
  - Assumes the types of defects activated are independent of temperature; not really the case!
  - Lacks resolution to account for some true retention degradation mechanisms
    - Stress-Induced Leakage Current (SILC)
    - Trap-Assisted-Tunneling (TAT)
    - Quantum tunneling
    - Program/Erase damage in dielectric thin film
  - Does not realistically address defect activation energy (E<sub>A</sub>)
- How do manufacturers account for such issues?
  - Arrhenius-based retention calculations typically result in 100+ year lifetimes
  - Manufacturers typically specify minimum 10 year retention time; some manufacturers specify retention times as a function of temperature.
  - Use order-of-magnitude margin
  - Cell design, stressing, and screening to eliminate or minimize certain classes of defects.
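For reference, the Arrhenius extrapolation mentioned above is usually written as an acceleration factor between the stress and use temperatures. The symbols $AF$, $T_{\text{use}}$, $T_{\text{stress}}$, $t_{\text{use}}$ and $t_{\text{stress}}$ are notation introduced here; $k$ is Boltzmann's constant and temperatures are in kelvin.

$$AF = \exp\left[\frac{E_A}{k}\left(\frac{1}{T_{\text{use}}} - \frac{1}{T_{\text{stress}}}\right)\right], \qquad t_{\text{use}} = AF \cdot t_{\text{stress}}$$

Because $E_A$ sits in the exponent, a modest error in the assumed activation energy changes the extrapolated lifetime by a large factor, which is one reason the choice of $E_A$ matters so much in the criticisms listed above.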



# Another Approach: Statistics, Black Boxes, and V<sub>CC</sub> Margin<sup>1</sup>



- Device Supply Voltage (V<sub>CC</sub>) Margin Tests:
  - Decrease device's supply voltage far below specification until an addressed bit "flips"
  - Utilize statistical analysis
  - Note: Threshold voltage margin testing for Flash cells, as described by JEDEC, is a valid test and is completely distinct from the "V<sub>CC</sub> Margin Test."
- Issues With Device Supply Voltage (V<sub>CC</sub>) Margin Testing of Flash Memory:
  - No physical basis for such tests; result is purely a statistical construct
    - Rationale of methodology inconsistent with accepted Flash cell theory of operation and engineering principles.
  - Assumes that the read voltage applied to the gate of the Flash cell is similar to V<sub>CC</sub>
    - Other than for first-generation parts, this is frequently not true. Read voltage on later-generation parts is developed by peripheral circuitry (charge pump, voltage boost).
    - Method does not utilize any "design for test" capability in the device.
  - Method can only attempt to test one of the two logic states (for dual level cell)
    - Lowering V<sub>CC</sub> cannot detect programmed bit cells (logic "0"); yet claim is that only logic "0"s are detected.
  - Results reported for charge loss are not credible.
  - Does not address the impact to Flash array peripheral circuitry
    - Voltage references, sense amplifiers, decoder logic, error correction all impacted by lower supply voltage
    - Results severely compromised because peripheral circuitry does not have infinite power supply rejection ratio (PSRR).
  - V<sub>CC</sub> margin testing is typically utilized as a gross methodology to evaluate processing quality
    - Much too coarse for utilization in data retention lifetime determination
    - Can be used to detect gross defects at the device or system level; not useful at the Flash cell level.
    - Is being misapplied for data retention time testing and characterization.

<sup>1</sup>See "JFTP Fuze – F-PLD Project, Understanding and Characterizing F-PLD Failure Modes in Fuzes," 57<sup>th</sup> Annual NDIA Fuze Conference, July 2014.





# Conclusion





# **Additional Material**



# Flash Attributes



- Flash Memory is:
  - High density
  - Low Cost
  - Non Volatile
  - Electrically Updateable
    - Read/Write by block, word, page
    - Block erasable: data reset to logical "1"
  - Pervasive!



NOR Flash Cell Cross-section

- Used in a wide variety of commercial and military applications

#### Flash Bit-Cell Operation:

- NMOS transistor modified to include insulated "floating gate" below top (select) gate
- Data is stored in the form of electrons on the polysilicon floating gate
- Stored charge modulates threshold voltage (V<sub>TH</sub>): Voltage at which conduction occurs
- Programmed State: ("0") high  $V_{TH}$
- Erased State: ("1") low  $V_{TH}$
- The Problem:
  - Floating gate charge data storage is an extremely fragile data retention mechanism
  - Easily upset by extrinsic environmental factors (elevated temp, radiation, high E-fields)
  - What is the true reliability (data retention lifetime) of a device with Flash bit-cells?
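The dependence of threshold voltage on stored charge can be stated with the standard first-order floating-gate relation. The symbols are notation introduced here, not taken from the briefing: $Q_{FG}$ is the net charge on the floating gate, $C_{CG}$ is the capacitance between the control (select) gate and the floating gate, $n$ is the number of stored electrons, and $q$ is the elementary charge.

$$\Delta V_{TH} = -\frac{Q_{FG}}{C_{CG}}, \qquad Q_{FG} = -n\,q$$

Stored electrons make $Q_{FG}$ negative and raise $V_{TH}$ (the programmed "0" state); any charge loss moves $V_{TH}$ back toward the erased level, which is why retention is tracked through $V_{TH}$ shifts.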



# Hybrid Flash Test Techniques: Microscopy



- Scanning Electron Microscopy (SEM)
  - Useful for geometric characterization of bit-cells
  - Generated data used to build geometrically accurate virtual bit-cell (modeling)
  - Energy Dispersive X-Ray (EDX) mode: used to characterize materials
  - SEM images used to characterize peripheral circuitry (reverse engineering)
- Focused Ion Beam (FIB)
  - Used to create cross-sectional views of individual bit cells for SEM analysis
  - Used to perform circuit edits for electrical characterization (bit-cell isolation)
- Transmission Electron Microscopy (TEM)
  - Used to characterize very small features such as dielectric film thickness
  - Sub-nm resolution is achievable with this technique
  - Electron Energy Loss Spectroscopy (EELS): chemical analysis of nm-scale samples
- Infrared Emission Microscopy (IREM)
  - Used to detect photon emission caused by current flow
  - Useful for mapping virtual memory location to physical location in Flash array



# Hybrid Flash Test Techniques: Temperature Stress Testing (1)





- Identify virtual location of weak bit-cells
  - Program Flash array with predesignated bit-cell values
  - Expose parts to elevated temperature to activate defects
  - Periodically measure threshold voltage ( $V_{TH}$ ) of bit-cells (both programmed and erased)
  - Resultant data will be a histogram illustrating the statistical "spread" of  $V_{TH}$  values
  - "Outliers" can be identified from histograms, as can other trends indicative of weak cells
  - Once the virtual address of a weak bit-cell is known, IREM is used to map its physical location
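The outlier-identification step can be sketched as follows. This is a minimal illustration, not the NASA/DMEA procedure: it flags cells whose $V_{TH}$ lies far from the array median using a robust z-score based on the median absolute deviation (MAD). The threshold of 5, the function name, and the sample data are all hypothetical.

```python
import statistics

def find_outlier_cells(vth_by_address, threshold=5.0):
    """Return addresses whose V_TH is far from the array median.

    Uses a robust z-score built on the median absolute deviation (MAD),
    so that a few weak cells do not inflate the spread estimate.
    """
    values = list(vth_by_address.values())
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        return []  # no measurable spread; nothing can be ranked
    scale = 1.4826 * mad  # MAD scaled to match sigma for normal data
    return sorted(addr for addr, v in vth_by_address.items()
                  if abs(v - median) / scale > threshold)

# Hypothetical programmed-cell readout: 200 cells near 6.0 V, one leaky cell
vth = {addr: 6.0 + 0.01 * ((addr * 7) % 11 - 5) for addr in range(200)}
vth[137] = 5.2  # planted weak cell that has lost charge
print("suspect addresses:", find_outlier_cells(vth))
```

The returned virtual addresses are the ones that would then be handed to IREM for physical location mapping.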



# Hybrid Flash Test Techniques: Temperature Stress Testing (2)





- Statistical analysis of V<sub>TH</sub> to monitor global array degradation
  - Above plot illustrates the effect of temperature stress; note the gradual change over time
  - Data such as this useful for characterizing the stress effect over millions of Flash cells
  - Gradual change indicates that applied stress is reasonable
    - Rapid change would indicate applied stress is too great
    - Becomes difficult to distinguish between applied stress and activated defects
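A drift rate in mV/year can be extracted from periodic readouts with an ordinary least-squares line. The sketch below is illustrative only: the readout schedule and $V_{TH}$ values are synthetic, and a linear trend is an assumption that should be checked against the actual data.

```python
HOURS_PER_YEAR = 8766.0  # 365.25 days

def drift_rate_mv_per_year(hours, mean_vth_volts):
    """Least-squares slope of mean V_TH versus stress time, in mV per year."""
    n = len(hours)
    mean_t = sum(hours) / n
    mean_v = sum(mean_vth_volts) / n
    sxx = sum((t - mean_t) ** 2 for t in hours)
    sxy = sum((t - mean_t) * (v - mean_v)
              for t, v in zip(hours, mean_vth_volts))
    slope_v_per_hour = sxy / sxx
    return slope_v_per_hour * 1000.0 * HOURS_PER_YEAR

# Synthetic readouts constructed to drift at exactly 50 mV/year
readout_hours = [0, 1000, 2000, 4000, 6000]
mean_vth = [1.000 + 0.050 * h / HOURS_PER_YEAR for h in readout_hours]

rate = drift_rate_mv_per_year(readout_hours, mean_vth)
print(f"drift rate = {rate:.1f} mV/year")
print(f"projected shift over 2.2 years = {rate * 2.2 / 1000:.3f} V")
```

Multiplying the fitted rate by a lifetime in years gives a projected $V_{TH}$ shift, the same arithmetic that relates the Rate, # Years, and Delta columns in the 150 °C retention table.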





A3P250L FPGA Average Programmed V<sub>TH</sub> 6,048 Hours @ 150 °C, June 1, 2015







A3P250L FPGAs: Average Programmed V<sub>TH</sub> Distribution, 242 Devices: Environmentally Stressed and 4 Years of Storage





# Hybrid Flash Test Techniques: Electrical Characterization



- Isolate and Characterize bit-cells
  - Measure parameters such as threshold voltage
  - Validate programming/erase waveforms and algorithms
- Isolate and characterize peripheral circuitry
  - Array elements such as charge pumps and sense amplifiers
  - Operation of these elements influences accuracy of data retention testing
- Electrical characterization data
  - Used to calibrate the model: curve-fit virtual data to electrical data
  - Calibration parameters include capacitance, current and voltage measurements
  - This data binds simulation models to physical device
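As a toy illustration of calibrating simulated data against bench measurements, the sketch below fits a single scale factor by closed-form least squares. Real device calibration involves many parameters and proprietary models; the one-parameter form, the function names, and the sample currents are all hypothetical.

```python
def fit_scale(simulated, measured):
    """Scale factor a that minimizes sum((m - a * s)**2) over paired points."""
    num = sum(s * m for s, m in zip(simulated, measured))
    den = sum(s * s for s in simulated)
    return num / den

def rms_residual(simulated, measured, scale):
    """Root-mean-square mismatch after applying the scale factor."""
    n = len(simulated)
    sq = sum((m - scale * s) ** 2 for s, m in zip(simulated, measured))
    return (sq / n) ** 0.5

# Hypothetical cell currents (uA) at four gate voltages: model vs. bench data
sim_ua = [1.0, 2.0, 4.0, 8.0]
meas_ua = [1.1, 2.3, 4.3, 8.9]

scale = fit_scale(sim_ua, meas_ua)
print(f"scale = {scale:.3f}")
print(f"rms residual = {rms_residual(sim_ua, meas_ua, scale):.3f} uA")
```

A residual that stays large after fitting would indicate that the model is missing a physical effect rather than merely needing a different parameter value.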







# Hybrid Flash Test Methodology: Summary



- The basic strategy:
  - Use temperature and voltage stress testing to identify trends such as global degradation and "weak" bit-cells (outliers)
    - These cells define the data retention time of the product
    - Testing over millions of bit-cells ensures that results are statistically relevant and "worst case" variation has been observed
  - Once weak cells are identified, utilize microscopy and electrical characterization techniques to gather modeling/characterization data:
    - Microscopy yields geometric and material system data
    - Electrical characterization yields data on difficult-to-measure quantities: impurity concentrations, resistive implants
    - This data serves to bind the simulation algorithms to the physical device
  - After a calibrated simulation model has been developed, utilize simulation techniques to develop data retention lifetime database
    - Advantage: simulations can be completed much more rapidly than physical testing
    - Since model is calibrated to physical device data: very little loss in accuracy
  - Utilize simulation database to develop predictive model for data retention lifetime



# Hybrid Flash Test Methodology: Advantages



- Advantages
  - Methodology is tied to the physical device: results are <u>not</u> a statistical construct!
    - All simulations utilize models calibrated to physical data
  - Methodology is independent of redundancy, wear leveling, and error correction
    - These can mask the true bit-cell failure rate and lifetime estimates
  - Methodology is easily adaptable to a wide range of non-volatile memory architectures
    - Split-gate Flash, EEPROM, Antifuse
  - Does not take years to obtain results:
    - Elevated temperature testing can take many years, depending on selected stress