A Case of Android Memory-Fragmentation OOM

Author: android小奉先 | Published 2022-09-08 10:51

    The problem

    Recently we came across a kind of OOM problem:


    [Screenshot unavailable: the OOM message reported for this crash]

    It means the Java heap has become fragmented: there are still 231 MB available, yet allocating 1.2 MB failed.
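
    (In case the screenshot does not load: judging from Heap::ThrowOutOfMemoryError, which we will read near the end of this analysis, the message has roughly the following shape. Everything in angle brackets is a placeholder; only the ~1.2 MB and 231 MB figures come from the actual report.)

    java.lang.OutOfMemoryError: Failed to allocate a <~1.2M> byte allocation
        with <~231M> free bytes and <...> until OOM, target footprint <...>,
        growth limit <...>; failed due to fragmentation
        (largest possible contiguous allocation <...> bytes)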

    The naive approach

    How should we analyze this problem?

    The most direct line of thought is to assume something is wrong with Java memory usage, most likely an unreasonable use of memory at the site in the OOM stack. Then one latches onto the keyword "fragmentation": if fragmentation is the issue, an object-pool-like mechanism should fix it. But this raises three questions:

    1. How does the app tell the SDK that an object can be recycled into the pool? Technically this is not hard, but doing it without changing the existing interfaces is fairly complicated.
    2. When does the app tell the SDK that an object can be reclaimed?
    3. How is the pool managed, and how many objects is it allowed to hold?

    Thinking it through, there is still a lot to consider. Phantom references can let us know when an object has been released, which just barely covers parts of questions 1 and 2 (see the sketch below), but truly handing a released object back to a pool's care remains difficult.
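
    To make the idea concrete, here is a minimal sketch of that phantom-reference approach (all class and method names are hypothetical, not from any real SDK). The tracker learns when a buffer it handed out has been collected, but a PhantomReference never yields its referent, so there is nothing left to recycle into a pool:

    import java.lang.ref.PhantomReference;
    import java.lang.ref.Reference;
    import java.lang.ref.ReferenceQueue;
    import java.nio.ByteBuffer;
    import java.util.Set;
    import java.util.concurrent.ConcurrentHashMap;

    class BufferTracker {
        private final ReferenceQueue<ByteBuffer> queue = new ReferenceQueue<>();
        // The references themselves must stay strongly reachable, or they
        // would be collected before they can be enqueued.
        private final Set<PhantomReference<ByteBuffer>> refs =
                ConcurrentHashMap.newKeySet();

        ByteBuffer allocate(int capacity) {
            ByteBuffer buffer = ByteBuffer.allocateDirect(capacity);
            refs.add(new PhantomReference<>(buffer, queue));
            return buffer;
        }

        // Poll periodically: this answers "when was the object released?"
        // (question 2), but ref.get() is always null for a phantom reference,
        // so the buffer cannot be handed back to a pool, which is exactly the
        // difficulty described above.
        void drain() {
            Reference<? extends ByteBuffer> ref;
            while ((ref = queue.poll()) != null) {
                refs.remove(ref);  // bookkeeping only, e.g. adjust counters
            }
        }
    }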

    At this point the approach already looks like a dead end, and indeed it is. There is also one crucial question left: with 231 MB still free, why does the allocation fail at all? Even with fragmentation, carving 1.2 MB out of 231 MB should be easy.

    It looks like we need a systematic analysis.

    The systematic approach

    Analyzing a problem is like a diagnosis in traditional Chinese medicine: look, listen, ask, and feel the pulse (望闻问切).
    First, look, that is, observe the symptoms. Check the scene at the time of the OOM in Bugly:
    [Screenshot unavailable: the crash scene at OOM time, as shown in Bugly]

    From this information we can draw the following conclusions:

    1. A DirectBuffer memory allocation failed;
    2. the VM actually still has plenty of free space;
    3. the device is 32-bit, and checking the other affected devices shows they are all 32-bit.

    Next, listen: look at the circumstances when the OOM happens, such as memory usage, user actions, foreground/background state, and so on. The most telling of these is memory usage, and after examining it the basic conclusions are:

    1. The OOM occurs only after the app has been running for a while, and the longer it runs, the higher the probability;
    2. the app's memory footprint climbs steadily, so there is certainly a memory leak.

    Next, ask: we asked the customer whether the problem could be reproduced; their testers could not reproduce it, so no useful information there.
    Finally, feel the pulse: start analyzing for the root cause.
    Starting from the failed DirectBuffer allocation, the most effective approach is to look for the cause in the source code:

     public static ByteBuffer allocateDirect(int capacity) {
            // Android-changed: Android's DirectByteBuffers carry a MemoryRef.
            // return new DirectByteBuffer(capacity);
            DirectByteBuffer.MemoryRef memoryRef = new DirectByteBuffer.MemoryRef(capacity);
            return new DirectByteBuffer(capacity, memoryRef);
        }
    
     MemoryRef(int capacity) {
                VMRuntime runtime = VMRuntime.getRuntime();
                buffer = (byte[]) runtime.newNonMovableArray(byte.class, capacity + 7);
                allocatedAddress = runtime.addressOf(buffer);
                // Offset is set to handle the alignment: http://b/16449607
                offset = (int) (((allocatedAddress + 7) & ~(long) 7) - allocatedAddress);
                isAccessible = true;
                isFreed = false;
                originalBufferObject = null;
      }
    

    As we can see, the backing array is allocated as a non-movable array, i.e., in the non-moving space:

    static jobject VMRuntime_newNonMovableArray(JNIEnv* env, jobject, jclass javaElementClass,
                                                jint length) {
      ScopedFastNativeObjectAccess soa(env);
      if (UNLIKELY(length < 0)) {
        ThrowNegativeArraySizeException(length);
        return nullptr;
      }
      ObjPtr<mirror::Class> element_class = soa.Decode<mirror::Class>(javaElementClass);
      if (UNLIKELY(element_class == nullptr)) {
        ThrowNullPointerException("element class == null");
        return nullptr;
      }
      Runtime* runtime = Runtime::Current();
      ObjPtr<mirror::Class> array_class =
          runtime->GetClassLinker()->FindArrayClass(soa.Self(), element_class);
      if (UNLIKELY(array_class == nullptr)) {
        return nullptr;
      }
      gc::AllocatorType allocator = runtime->GetHeap()->GetCurrentNonMovingAllocator();
      ObjPtr<mirror::Array> result = mirror::Array::Alloc(soa.Self(),
                                                          array_class,
                                                          length,
                                                          array_class->GetComponentSizeShift(),
                                                          allocator);
      return soa.AddLocalReference<jobject>(result);
    }
    

    Here the allocator is kAllocatorTypeNonMoving, and what follows is the VM's memory-allocation flow:

    template <bool kIsInstrumented, bool kFillUsable>
    inline ObjPtr<Array> Array::Alloc(Thread* self,
                                      ObjPtr<Class> array_class,
                                      int32_t component_count,
                                      size_t component_size_shift,
                                      gc::AllocatorType allocator_type) {
      DCHECK(allocator_type != gc::kAllocatorTypeLOS);
      DCHECK(array_class != nullptr);
      DCHECK(array_class->IsArrayClass());
      DCHECK_EQ(array_class->GetComponentSizeShift(), component_size_shift);
      DCHECK_EQ(array_class->GetComponentSize(), (1U << component_size_shift));
      size_t size = ComputeArraySize(component_count, component_size_shift);
    #ifdef __LP64__
      // 64-bit. No size_t overflow.
      DCHECK_NE(size, 0U);
    #else
      // 32-bit.
      if (UNLIKELY(size == 0)) {
        self->ThrowOutOfMemoryError(android::base::StringPrintf("%s of length %d would overflow",
                                                                array_class->PrettyDescriptor().c_str(),
                                                                component_count).c_str());
        return nullptr;
      }
    #endif
      gc::Heap* heap = Runtime::Current()->GetHeap();
      ObjPtr<Array> result;
      if (!kFillUsable) {
        SetLengthVisitor visitor(component_count);
        result = ObjPtr<Array>::DownCast(
            heap->AllocObjectWithAllocator<kIsInstrumented>(
                self, array_class, size, allocator_type, visitor));
      } else {
        SetLengthToUsableSizeVisitor visitor(component_count,
                                             DataOffset(1U << component_size_shift).SizeValue(),
                                             component_size_shift);
        result = ObjPtr<Array>::DownCast(
            heap->AllocObjectWithAllocator<kIsInstrumented>(
                self, array_class, size, allocator_type, visitor));
      }
      if (kIsDebugBuild && result != nullptr && Runtime::Current()->IsStarted()) {
        array_class = result->GetClass();  // In case the array class moved.
        CHECK_EQ(array_class->GetComponentSize(), 1U << component_size_shift);
        if (!kFillUsable) {
          CHECK_EQ(result->SizeOf(), size);
        } else {
          CHECK_GE(result->SizeOf(), size);
        }
      }
      return result;
    }
    

    We can see that this eventually reaches AllocObjectWithAllocator in the heap, where the real allocation starts. The flow is: try to allocate; on failure, run a GC and retry; then GC again, grow the heap, and retry; if that still fails, compact the heap where necessary to deal with fragmentation. The non-moving space, however, is never compacted, so the exception is thrown directly. Next we can see exactly where the "memory fragmentation" in the OOM comes from.
    Here is the key flow:

    mirror::Object* Heap::AllocateInternalWithGc(Thread* self,
                                                 AllocatorType allocator,
                                                 bool instrumented,
                                                 size_t alloc_size,
                                                 size_t* bytes_allocated,
                                                 size_t* usable_size,
                                                 size_t* bytes_tl_bulk_allocated,
                                                 ObjPtr<mirror::Class>* klass) {
     
      bool was_default_allocator = allocator == GetCurrentAllocator();
      // Make sure there is no pending exception since we may need to throw an OOME.
      self->AssertNoPendingException();
      DCHECK(klass != nullptr);
    
      StackHandleScope<1> hs(self);
      HandleWrapperObjPtr<mirror::Class> h_klass(hs.NewHandleWrapper(klass));
    
      auto send_object_pre_alloc =
          [&]() REQUIRES_SHARED(Locks::mutator_lock_) REQUIRES(!Roles::uninterruptible_) {
            if (UNLIKELY(instrumented)) {
              AllocationListener* l = alloc_listener_.load(std::memory_order_seq_cst);
              if (UNLIKELY(l != nullptr) && UNLIKELY(l->HasPreAlloc())) {
                l->PreObjectAllocated(self, h_klass, &alloc_size);
              }
            }
          };
    #define PERFORM_SUSPENDING_OPERATION(op)                                          \
      [&]() REQUIRES(Roles::uninterruptible_) REQUIRES_SHARED(Locks::mutator_lock_) { \
        ScopedAllowThreadSuspension ats;                                              \
        auto res = (op);                                                              \
        send_object_pre_alloc();                                                      \
        return res;                                                                   \
      }()
    
      // The allocation failed. If the GC is running, block until it completes, and then retry the
      // allocation.
      collector::GcType last_gc =
          PERFORM_SUSPENDING_OPERATION(WaitForGcToComplete(kGcCauseForAlloc, self));
      // If we were the default allocator but the allocator changed while we were suspended,
      // abort the allocation.
      if ((was_default_allocator && allocator != GetCurrentAllocator()) ||
          (!instrumented &amp;&amp; EntrypointsInstrumented())) {
        return nullptr;
      }
      uint32_t starting_gc_num = GetCurrentGcNum();
      if (last_gc != collector::kGcTypeNone) {
        // A GC was in progress and we blocked, retry allocation now that memory has been freed.
        mirror::Object* ptr = TryToAllocate<true, false>(self, allocator, alloc_size, bytes_allocated,
                                                         usable_size, bytes_tl_bulk_allocated);
        if (ptr != nullptr) {
          return ptr;
        }
      }
      // Decide whether enough memory was reclaimed. If enough of the heap is
      // free, a failed allocation will still be retried after growing the heap.
      // This is the key to answering the questions above.
      auto have_reclaimed_enough = [&]() {
        size_t curr_bytes_allocated = GetBytesAllocated();
        double curr_free_heap =
            static_cast<double>(growth_limit_ - curr_bytes_allocated) / growth_limit_;
        return curr_free_heap >= kMinFreeHeapAfterGcForAlloc;
      };
      // We perform one GC as per the next_gc_type_ (chosen in GrowForUtilization),
      // if it's not already tried. If that doesn't succeed then go for the most
      // exhaustive option. Perform a full-heap collection including clearing
      // SoftReferences. In case of ConcurrentCopying, it will also ensure that
      // all regions are evacuated. If allocation doesn't succeed even after that
      // then there is no hope, so we throw OOME.
      collector::GcType tried_type = next_gc_type_;
      if (last_gc < tried_type) {
        const bool gc_ran = PERFORM_SUSPENDING_OPERATION(
            CollectGarbageInternal(tried_type, kGcCauseForAlloc, false, starting_gc_num + 1)
            != collector::kGcTypeNone);
    
        if ((was_default_allocator && allocator != GetCurrentAllocator()) ||
            (!instrumented &amp;&amp; EntrypointsInstrumented())) {
          return nullptr;
        }
        if (gc_ran &amp;&amp; have_reclaimed_enough()) {
          mirror::Object* ptr = TryToAllocate<true, false>(self, allocator,
                                                           alloc_size, bytes_allocated,
                                                           usable_size, bytes_tl_bulk_allocated);
          if (ptr != nullptr) {
            return ptr;
          }
        }
      }
      // Most allocations should have succeeded by now, so the heap is really full, really fragmented,
      // or the requested size is really big. Do another GC, collecting SoftReferences this time. The
      // VM spec requires that all SoftReferences have been collected and cleared before throwing
      // OOME.
      VLOG(gc) << "Forcing collection of SoftReferences for " << PrettySize(alloc_size)
               << " allocation";
      // TODO: Run finalization, but this may cause more allocations to occur.
      // We don't need a WaitForGcToComplete here either.
      // TODO: Should check whether another thread already just ran a GC with soft
      // references.
      DCHECK(!gc_plan_.empty());
      pre_oome_gc_count_.fetch_add(1, std::memory_order_relaxed);
      PERFORM_SUSPENDING_OPERATION(
          CollectGarbageInternal(gc_plan_.back(), kGcCauseForAlloc, true, GC_NUM_ANY));
      if ((was_default_allocator && allocator != GetCurrentAllocator()) ||
          (!instrumented &amp;&amp; EntrypointsInstrumented())) {
        return nullptr;
      }
      mirror::Object* ptr = nullptr;
      // The heap certainly had room to grow here (at OOM time hundreds of MB
      // were still free), so the ptr returned must be null, which is the only
      // way to reach the ThrowOutOfMemoryError below.
      if (have_reclaimed_enough()) {
        ptr = TryToAllocate<true, true>(self, allocator, alloc_size, bytes_allocated,
                                        usable_size, bytes_tl_bulk_allocated);
      }
    
      if (ptr == nullptr) {
        const uint64_t current_time = NanoTime();
        switch (allocator) {
          case kAllocatorTypeRosAlloc:
            // Fall-through.
          case kAllocatorTypeDlMalloc: {
            if (use_homogeneous_space_compaction_for_oom_ &&
                current_time - last_time_homogeneous_space_compaction_by_oom_ >
                min_interval_homogeneous_space_compaction_by_oom_) {
              last_time_homogeneous_space_compaction_by_oom_ = current_time;
              HomogeneousSpaceCompactResult result =
                  PERFORM_SUSPENDING_OPERATION(PerformHomogeneousSpaceCompact());
              // Thread suspension could have occurred.
              if ((was_default_allocator && allocator != GetCurrentAllocator()) ||
                  (!instrumented &amp;&amp; EntrypointsInstrumented())) {
                return nullptr;
              }
              switch (result) {
                case HomogeneousSpaceCompactResult::kSuccess:
                  // If the allocation succeeded, we delayed an oom.
                  ptr = TryToAllocate<true, true>(self, allocator, alloc_size, bytes_allocated,
                                                  usable_size, bytes_tl_bulk_allocated);
                  if (ptr != nullptr) {
                    count_delayed_oom_++;
                  }
                  break;
                case HomogeneousSpaceCompactResult::kErrorReject:
                  // Reject due to disabled moving GC.
                  break;
                case HomogeneousSpaceCompactResult::kErrorVMShuttingDown:
                  // Throw OOM by default.
                  break;
                default: {
                  UNIMPLEMENTED(FATAL) << "homogeneous space compaction result: "
                      << static_cast<size_t>(result);
                  UNREACHABLE();
                }
              }
              // Always print that we ran homogeneous space compation since this can cause jank.
              VLOG(heap) << "Ran heap homogeneous space compaction, "
                        << " requested defragmentation "
                        << count_requested_homogeneous_space_compaction_.load()
                        << " performed defragmentation "
                        << count_performed_homogeneous_space_compaction_.load()
                        << " ignored homogeneous space compaction "
                        << count_ignored_homogeneous_space_compaction_.load()
                        << " delayed count = "
                        << count_delayed_oom_.load();
            }
            break;
          }
          default: {
            // Do nothing for others allocators.
          }
        }
      }
    #undef PERFORM_SUSPENDING_OPERATION
      // If the allocation hasn't succeeded by this point, throw an OOM error.
      if (ptr == nullptr) {
        ScopedAllowThreadSuspension ats;
        ThrowOutOfMemoryError(self, alloc_size, allocator);
      }
      return ptr;
    }

    TryToAllocate dispatches to the matching Alloc depending on the allocator; we only need to look at the non-moving case:

    case kAllocatorTypeNonMoving: {
          ret = non_moving_space_->Alloc(self,
                                         alloc_size,
                                         bytes_allocated,
                                         usable_size,
                                         bytes_tl_bulk_allocated);
    

    The non-moving space is in fact a DlMallocSpace; after several layers of calls, the final call site is:

    inline mirror::Object* DlMallocSpace::AllocWithoutGrowthLocked(
        Thread* /*self*/, size_t num_bytes,
        size_t* bytes_allocated,
        size_t* usable_size,
        size_t* bytes_tl_bulk_allocated) {
      mirror::Object* result = reinterpret_cast<mirror::Object*>(mspace_malloc(mspace_, num_bytes));
      if (LIKELY(result != nullptr)) {
        if (kDebugSpaces) {
          CHECK(Contains(result)) << "Allocation (" << reinterpret_cast<void*>(result)
                << ") not in bounds of allocation space " << *this;
        }
        size_t allocation_size = AllocationSizeNonvirtual(result, usable_size);
        DCHECK(bytes_allocated != nullptr);
        *bytes_allocated = allocation_size;
        *bytes_tl_bulk_allocated = allocation_size;
      }
      return result;
    }
    

    So it is mspace_malloc that failed. mspace_malloc can be thought of as a malloc confined to a designated space. Next let's see how the OOM is handled when it fails:

    void Heap::ThrowOutOfMemoryError(Thread* self, size_t byte_count, AllocatorType allocator_type) {
      // If we're in a stack overflow, do not create a new exception. It would require running the
      // constructor, which will of course still be in a stack overflow.
      if (self->IsHandlingStackOverflow()) {
        self->SetException(
            Runtime::Current()->GetPreAllocatedOutOfMemoryErrorWhenHandlingStackOverflow());
        return;
      }
    
      std::ostringstream oss;
      size_t total_bytes_free = GetFreeMemory();
      // This is the first half of the OOM message we saw; the goal now is to
      // find where the second half comes from.
      oss << "Failed to allocate a " << byte_count << " byte allocation with " << total_bytes_free
          << " free bytes and " << PrettySize(GetFreeMemoryUntilOOME()) << " until OOM,"
          << " target footprint " << target_footprint_.load(std::memory_order_relaxed)
          << ", growth limit "
          << growth_limit_;
      // If the allocation failed due to fragmentation, print out the largest continuous allocation.
      // As long as the free space exceeds the requested size, it goes on to
      // check the corresponding space for fragmentation.
      if (total_bytes_free >= byte_count) {
        space::AllocSpace* space = nullptr;
        if (allocator_type == kAllocatorTypeNonMoving) {
          space = non_moving_space_;
        } else if (allocator_type == kAllocatorTypeRosAlloc ||
                   allocator_type == kAllocatorTypeDlMalloc) {
          space = main_space_;
        } else if (allocator_type == kAllocatorTypeBumpPointer ||
                   allocator_type == kAllocatorTypeTLAB) {
          space = bump_pointer_space_;
        } else if (allocator_type == kAllocatorTypeRegion ||
                   allocator_type == kAllocatorTypeRegionTLAB) {
          space = region_space_;
        }
    
        // There is no fragmentation info to log for large-object space.
        if (allocator_type != kAllocatorTypeLOS) {
          CHECK(space != nullptr) << "allocator_type:" << allocator_type
                                  << " byte_count:" << byte_count
                                  << " total_bytes_free:" << total_bytes_free;
          // LogFragmentationAllocFailure returns true if byte_count is greater than
          // the largest free contiguous chunk in the space. Return value false
          // means that we are throwing OOME because the amount of free heap after
          // GC is less than kMinFreeHeapAfterGcForAlloc in proportion of the heap-size.
          // Log an appropriate message in that case.
          if (!space->LogFragmentationAllocFailure(oss, byte_count)) {
            oss << "; giving up on allocation because <"
                << kMinFreeHeapAfterGcForAlloc * 100
                << "% of heap free after GC.";
          }
        }
      }
      self->ThrowOutOfMemoryError(oss.str().c_str());
    }
    

    Since our allocator is the non-moving one, and the non-moving space is a DlMallocSpace, the internal logic is as follows:

    bool DlMallocSpace::LogFragmentationAllocFailure(std::ostream& os,
                                                     size_t failed_alloc_bytes) {
      Thread* const self = Thread::Current();
      size_t max_contiguous_allocation = 0;
      // To allow the Walk/InspectAll() to exclusively-lock the mutator
      // lock, temporarily release the shared access to the mutator
      // lock here by transitioning to the suspended state.
      Locks::mutator_lock_->AssertSharedHeld(self);
      ScopedThreadSuspension sts(self, ThreadState::kSuspended);
      Walk(MSpaceChunkCallback, &max_contiguous_allocation);
      if (failed_alloc_bytes > max_contiguous_allocation) {
        os << "; failed due to fragmentation (largest possible contiguous allocation "
           <<  max_contiguous_allocation << " bytes)";
        return true;
      }
      return false;
    }
    

    At this point the largest contiguous free chunk is necessarily smaller than the requested size, otherwise the allocation would have succeeded, so "fragmentation" gets reported.
    With that, essentially all of the earlier questions can be answered:

    1. Why the OOM? Because mspace_malloc failed, and a failing malloc essentially means there is no usable virtual address space left to allocate from, i.e., a native-level memory leak;
    2. why does it report fragmentation? Because the VM believes there is still plenty of free heap while the largest contiguous chunk is smaller than the request, so it concludes the problem is fragmentation (a sketch for watching the address space drain on a device follows below).
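
    As an aside, conclusion 1 can be watched directly on a device: on 32-bit it is the virtual address space, not the Java heap, that runs out. The helper below is a hypothetical sketch (not part of the original investigation) that samples VmSize from /proc/self/status; before an OOM like this one, the value climbs toward the roughly 3 GB of address space usable by a 32-bit process:

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;

    final class VmSizeSampler {
        // Returns the process's virtual memory footprint in kB, or -1 on failure.
        static long readVmSizeKb() {
            try (BufferedReader br =
                         new BufferedReader(new FileReader("/proc/self/status"))) {
                String line;
                while ((line = br.readLine()) != null) {
                    if (line.startsWith("VmSize:")) {
                        // Line format: "VmSize:  1234567 kB"
                        String[] parts = line.trim().split("\\s+");
                        return Long.parseLong(parts[1]);
                    }
                }
            } catch (IOException | NumberFormatException e) {
                // Sampling is best-effort; ignore and fall through.
            }
            return -1;
        }
    }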

    Next, an experiment to verify the conclusion of this analysis:

    extern "C" JNIEXPORT jstring JNICALL
    Java_com_example_memleakdemo_MainActivity_stringFromJNI(
            JNIEnv* env,
            jobject /* this */) {
        for (int i =0; i < 2000; i ++) {
            void *p = malloc(1 * 1024 * 1024);
        }
        std::string hello = "Hello from C++";
        return env->NewStringUTF(hello.c_str());
    }
    
    static {
        // Library name assumed from the package name com.example.memleakdemo.
        System.loadLibrary("memleakdemo");
    }

    public native String stringFromJNI();

    private ActivityMainBinding binding;
    private final List<ByteBuffer> list = new ArrayList<>();

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);

        binding = ActivityMainBinding.inflate(getLayoutInflater());
        setContentView(binding.getRoot());

        // First leak native memory via JNI, then allocate direct buffers until OOM.
        new Thread(new Runnable() {
            @Override
            public void run() {
                stringFromJNI();
                allocateBuffer();
            }
        }).start();
    }

    private void allocateBuffer() {
        int i = 0;
        while (true) {
            // Each DirectByteBuffer is backed by a byte[] in the non-moving space.
            ByteBuffer bb = ByteBuffer.allocateDirect(10 * 1024 * 1024);
            list.add(bb);
            i++;
            Log.i("lhr", "allocate " + i * 10 + " MB");
        }
    }
    

    Run it on a 32-bit phone. Once the native leak has consumed most of the virtual address space, the byte[] backing each DirectByteBuffer can no longer be carved out of the non-moving space, even though the Java heap reports plenty of free memory. The result:


    [Screenshot unavailable: logcat of the experiment ending in the same fragmentation OOM]

    This confirms the analysis.
