Problem Introduction
Recently I ran into one particular kind of OOM:
[image: OOM crash log]
The message says the Java heap is fragmented: even though 231 MB were reportedly still free, an allocation of 1.2 MB failed.
The Naive Approach
How do we analyze this problem?
The most direct reaction is to assume that Java memory usage has gone wrong, most likely at the allocation site in the OOM stack trace. The keyword in the log is "fragmentation", and the classic remedy for fragmentation is something like an object pool. But that immediately raises three questions:
- How does the application tell the SDK that an object can be returned to the pool? Technically easy, but doing it without changing the public API is complicated.
- When should the application tell the SDK that an object can be reclaimed?
- How is the pool managed, and how many objects may it hold?
Thinking it through, there is still a lot to consider. Phantom references can tell us when an object is about to be reclaimed, which more or less covers questions 1 and 2 (a minimal sketch follows), but actually handing a released object over to a pool remains hard.
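As an aside, here is a minimal, self-contained sketch of the phantom-reference technique mentioned above. The class name and the buffer are hypothetical; it only demonstrates that a ReferenceQueue lets us observe when a direct buffer becomes unreachable, not a production-grade pool:
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.nio.ByteBuffer;

public class PhantomDemo {
    public static void main(String[] args) throws InterruptedException {
        ReferenceQueue<ByteBuffer> queue = new ReferenceQueue<>();
        ByteBuffer buffer = ByteBuffer.allocateDirect(1024);
        // Enqueued by the GC once the buffer becomes unreachable.
        PhantomReference<ByteBuffer> ref = new PhantomReference<>(buffer, queue);
        buffer = null;   // drop the only strong reference
        System.gc();     // only a hint; a real GC may or may not run
        Reference<?> collected = queue.remove(5000);  // wait up to 5 seconds
        if (collected == ref) {
            // In principle a pool could take over the backing memory here.
            System.out.println("buffer became unreachable");
        }
    }
}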
At this point the idea looks like a dead end, and indeed it is. More importantly, one key question remains: there are still 231 MB free, so why does the allocation fail? Even with fragmentation, finding a contiguous 1.2 MB inside 231 MB should be easy.
It looks like a proper, systematic analysis is needed.
The Systematic Approach
Analyzing a problem is like a diagnosis in traditional Chinese medicine: look, listen, ask, and feel the pulse (望闻问切).
First, looking: examine the symptoms. Check the scene of the OOM in Bugly:
[image: Bugly report of the OOM scene]
From this report we can conclude:
- DirectBuffer failed to allocate memory;
- the virtual machine actually still had plenty of free space;
- the device is 32-bit, and checking the other affected devices shows they are all 32-bit (a quick way to confirm a process's bitness is sketched below).
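For completeness, here is a tiny sketch (assuming API 23+, where android.os.Process.is64Bit() is available) that logs whether the current process runs as 32-bit or 64-bit; the class name and log tag are made up for illustration:
import android.os.Build;
import android.os.Process;
import android.util.Log;
import java.util.Arrays;

public class BitnessCheck {
    // Logs the process bitness and the ABIs the device supports.
    public static void log() {
        Log.i("BitnessCheck", "is64Bit=" + Process.is64Bit()
                + ", supportedAbis=" + Arrays.toString(Build.SUPPORTED_ABIS));
    }
}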
Next, listening: look at the circumstances around the OOM, such as memory usage, user operations, and foreground/background state. The key signal here is memory usage, which yields two conclusions:
- the OOM happens after the app has been running for a while, and the probability is highest after long sessions;
- the app's memory footprint climbs steadily, so there is almost certainly a memory leak.
Next, asking: we asked the customer whether the problem could be reproduced. Their testers could not reproduce it, so no useful information there.
Finally, feeling the pulse: time to find the root cause.
Starting from the DirectBuffer allocation failure, the most effective method is to look for the cause in the source code:
public static ByteBuffer allocateDirect(int capacity) {
    // Android-changed: Android's DirectByteBuffers carry a MemoryRef.
    // return new DirectByteBuffer(capacity);
    DirectByteBuffer.MemoryRef memoryRef = new DirectByteBuffer.MemoryRef(capacity);
    return new DirectByteBuffer(capacity, memoryRef);
}
MemoryRef(int capacity) {
    VMRuntime runtime = VMRuntime.getRuntime();
    buffer = (byte[]) runtime.newNonMovableArray(byte.class, capacity + 7);
    allocatedAddress = runtime.addressOf(buffer);
    // Offset is set to handle the alignment: http://b/16449607
    offset = (int) (((allocatedAddress + 7) & ~(long) 7) - allocatedAddress);
    isAccessible = true;
    isFreed = false;
    originalBufferObject = null;
}
So the backing array is allocated in the non-moving space, through VMRuntime.newNonMovableArray:
static jobject VMRuntime_newNonMovableArray(JNIEnv* env, jobject, jclass javaElementClass,
                                            jint length) {
  ScopedFastNativeObjectAccess soa(env);
  if (UNLIKELY(length < 0)) {
    ThrowNegativeArraySizeException(length);
    return nullptr;
  }
  ObjPtr<mirror::Class> element_class = soa.Decode<mirror::Class>(javaElementClass);
  if (UNLIKELY(element_class == nullptr)) {
    ThrowNullPointerException("element class == null");
    return nullptr;
  }
  Runtime* runtime = Runtime::Current();
  ObjPtr<mirror::Class> array_class =
      runtime->GetClassLinker()->FindArrayClass(soa.Self(), element_class);
  if (UNLIKELY(array_class == nullptr)) {
    return nullptr;
  }
  gc::AllocatorType allocator = runtime->GetHeap()->GetCurrentNonMovingAllocator();
  ObjPtr<mirror::Array> result = mirror::Array::Alloc(soa.Self(),
                                                      array_class,
                                                      length,
                                                      array_class->GetComponentSizeShift(),
                                                      allocator);
  return soa.AddLocalReference<jobject>(result);
}
The allocator here is kAllocatorTypeNonMoving. Next comes the VM's actual allocation path:
template <bool kIsInstrumented, bool kFillUsable>
inline ObjPtr<Array> Array::Alloc(Thread* self,
                                  ObjPtr<Class> array_class,
                                  int32_t component_count,
                                  size_t component_size_shift,
                                  gc::AllocatorType allocator_type) {
  DCHECK(allocator_type != gc::kAllocatorTypeLOS);
  DCHECK(array_class != nullptr);
  DCHECK(array_class->IsArrayClass());
  DCHECK_EQ(array_class->GetComponentSizeShift(), component_size_shift);
  DCHECK_EQ(array_class->GetComponentSize(), (1U << component_size_shift));
  size_t size = ComputeArraySize(component_count, component_size_shift);
#ifdef __LP64__
  // 64-bit. No size_t overflow.
  DCHECK_NE(size, 0U);
#else
  // 32-bit.
  if (UNLIKELY(size == 0)) {
    self->ThrowOutOfMemoryError(android::base::StringPrintf("%s of length %d would overflow",
                                                            array_class->PrettyDescriptor().c_str(),
                                                            component_count).c_str());
    return nullptr;
  }
#endif
  gc::Heap* heap = Runtime::Current()->GetHeap();
  ObjPtr<Array> result;
  if (!kFillUsable) {
    SetLengthVisitor visitor(component_count);
    result = ObjPtr<Array>::DownCast(
        heap->AllocObjectWithAllocator<kIsInstrumented>(
            self, array_class, size, allocator_type, visitor));
  } else {
    SetLengthToUsableSizeVisitor visitor(component_count,
                                         DataOffset(1U << component_size_shift).SizeValue(),
                                         component_size_shift);
    result = ObjPtr<Array>::DownCast(
        heap->AllocObjectWithAllocator<kIsInstrumented>(
            self, array_class, size, allocator_type, visitor));
  }
  if (kIsDebugBuild && result != nullptr && Runtime::Current()->IsStarted()) {
    array_class = result->GetClass();  // In case the array class moved.
    CHECK_EQ(array_class->GetComponentSize(), 1U << component_size_shift);
    if (!kFillUsable) {
      CHECK_EQ(result->SizeOf(), size);
    } else {
      CHECK_GE(result->SizeOf(), size);
    }
  }
  return result;
}
As we can see, everything eventually funnels into Heap::AllocObjectWithAllocator, where the real allocation happens. The flow is: try to allocate; on failure, run a GC and retry; then GC again plus grow the heap, and retry; if that still fails, compact the heap to defragment it where possible. The non-moving space is never compacted, so at that point an exception is thrown directly. Now we can see where the "fragmentation" in our OOM message comes from.
Here is the key part of the slow path:
mirror::Object* Heap::AllocateInternalWithGc(Thread* self,
                                             AllocatorType allocator,
                                             bool instrumented,
                                             size_t alloc_size,
                                             size_t* bytes_allocated,
                                             size_t* usable_size,
                                             size_t* bytes_tl_bulk_allocated,
                                             ObjPtr<mirror::Class>* klass) {
  bool was_default_allocator = allocator == GetCurrentAllocator();
  // Make sure there is no pending exception since we may need to throw an OOME.
  self->AssertNoPendingException();
  DCHECK(klass != nullptr);
  StackHandleScope<1> hs(self);
  HandleWrapperObjPtr<mirror::Class> h_klass(hs.NewHandleWrapper(klass));
  auto send_object_pre_alloc =
      [&]() REQUIRES_SHARED(Locks::mutator_lock_) REQUIRES(!Roles::uninterruptible_) {
        if (UNLIKELY(instrumented)) {
          AllocationListener* l = alloc_listener_.load(std::memory_order_seq_cst);
          if (UNLIKELY(l != nullptr) && UNLIKELY(l->HasPreAlloc())) {
            l->PreObjectAllocated(self, h_klass, &alloc_size);
          }
        }
      };
#define PERFORM_SUSPENDING_OPERATION(op)                                          \
  [&]() REQUIRES(Roles::uninterruptible_) REQUIRES_SHARED(Locks::mutator_lock_) { \
    ScopedAllowThreadSuspension ats;                                              \
    auto res = (op);                                                              \
    send_object_pre_alloc();                                                      \
    return res;                                                                   \
  }()
  // The allocation failed. If the GC is running, block until it completes, and then retry the
  // allocation.
  collector::GcType last_gc =
      PERFORM_SUSPENDING_OPERATION(WaitForGcToComplete(kGcCauseForAlloc, self));
  // If we were the default allocator but the allocator changed while we were suspended,
  // abort the allocation.
  if ((was_default_allocator && allocator != GetCurrentAllocator()) ||
      (!instrumented && EntrypointsInstrumented())) {
    return nullptr;
  }
  uint32_t starting_gc_num = GetCurrentGcNum();
  if (last_gc != collector::kGcTypeNone) {
    // A GC was in progress and we blocked, retry allocation now that memory has been freed.
    mirror::Object* ptr = TryToAllocate<true, false>(self, allocator, alloc_size, bytes_allocated,
                                                     usable_size, bytes_tl_bulk_allocated);
    if (ptr != nullptr) {
      return ptr;
    }
  }
  // Did the GC reclaim enough memory? If enough space remains free, a failed allocation will
  // still grow the heap and retry. This lambda is the key to the questions raised above.
  auto have_reclaimed_enough = [&]() {
    size_t curr_bytes_allocated = GetBytesAllocated();
    double curr_free_heap =
        static_cast<double>(growth_limit_ - curr_bytes_allocated) / growth_limit_;
    return curr_free_heap >= kMinFreeHeapAfterGcForAlloc;
  };
  // We perform one GC as per the next_gc_type_ (chosen in GrowForUtilization),
  // if it's not already tried. If that doesn't succeed then go for the most
  // exhaustive option. Perform a full-heap collection including clearing
  // SoftReferences. In case of ConcurrentCopying, it will also ensure that
  // all regions are evacuated. If allocation doesn't succeed even after that
  // then there is no hope, so we throw OOME.
  collector::GcType tried_type = next_gc_type_;
  if (last_gc < tried_type) {
    const bool gc_ran = PERFORM_SUSPENDING_OPERATION(
        CollectGarbageInternal(tried_type, kGcCauseForAlloc, false, starting_gc_num + 1)
        != collector::kGcTypeNone);
    if ((was_default_allocator && allocator != GetCurrentAllocator()) ||
        (!instrumented && EntrypointsInstrumented())) {
      return nullptr;
    }
    if (gc_ran && have_reclaimed_enough()) {
      mirror::Object* ptr = TryToAllocate<true, false>(self, allocator,
                                                       alloc_size, bytes_allocated,
                                                       usable_size, bytes_tl_bulk_allocated);
      if (ptr != nullptr) {
        return ptr;
      }
    }
  }
  // Most allocations should have succeeded by now, so the heap is really full, really fragmented,
  // or the requested size is really big. Do another GC, collecting SoftReferences this time. The
  // VM spec requires that all SoftReferences have been collected and cleared before throwing
  // OOME.
  VLOG(gc) << "Forcing collection of SoftReferences for " << PrettySize(alloc_size)
           << " allocation";
  // TODO: Run finalization, but this may cause more allocations to occur.
  // We don't need a WaitForGcToComplete here either.
  // TODO: Should check whether another thread already just ran a GC with soft
  // references.
  DCHECK(!gc_plan_.empty());
  pre_oome_gc_count_.fetch_add(1, std::memory_order_relaxed);
  PERFORM_SUSPENDING_OPERATION(
      CollectGarbageInternal(gc_plan_.back(), kGcCauseForAlloc, true, GC_NUM_ANY));
  if ((was_default_allocator && allocator != GetCurrentAllocator()) ||
      (!instrumented && EntrypointsInstrumented())) {
    return nullptr;
  }
  mirror::Object* ptr = nullptr;
  // The heap certainly grew by this point (we still had a couple hundred MB free at OOM time),
  // so the ptr returned here must be null; only then does execution reach the
  // ThrowOutOfMemoryError below.
  if (have_reclaimed_enough()) {
    ptr = TryToAllocate<true, true>(self, allocator, alloc_size, bytes_allocated,
                                    usable_size, bytes_tl_bulk_allocated);
  }
  if (ptr == nullptr) {
    const uint64_t current_time = NanoTime();
    switch (allocator) {
      case kAllocatorTypeRosAlloc:
        // Fall-through.
      case kAllocatorTypeDlMalloc: {
        if (use_homogeneous_space_compaction_for_oom_ &&
            current_time - last_time_homogeneous_space_compaction_by_oom_ >
            min_interval_homogeneous_space_compaction_by_oom_) {
          last_time_homogeneous_space_compaction_by_oom_ = current_time;
          HomogeneousSpaceCompactResult result =
              PERFORM_SUSPENDING_OPERATION(PerformHomogeneousSpaceCompact());
          // Thread suspension could have occurred.
          if ((was_default_allocator && allocator != GetCurrentAllocator()) ||
              (!instrumented && EntrypointsInstrumented())) {
            return nullptr;
          }
          switch (result) {
            case HomogeneousSpaceCompactResult::kSuccess:
              // If the allocation succeeded, we delayed an oom.
              ptr = TryToAllocate<true, true>(self, allocator, alloc_size, bytes_allocated,
                                              usable_size, bytes_tl_bulk_allocated);
              if (ptr != nullptr) {
                count_delayed_oom_++;
              }
              break;
            case HomogeneousSpaceCompactResult::kErrorReject:
              // Reject due to disabled moving GC.
              break;
            case HomogeneousSpaceCompactResult::kErrorVMShuttingDown:
              // Throw OOM by default.
              break;
            default: {
              UNIMPLEMENTED(FATAL) << "homogeneous space compaction result: "
                                   << static_cast<size_t>(result);
              UNREACHABLE();
            }
          }
          // Always print that we ran homogeneous space compation since this can cause jank.
          VLOG(heap) << "Ran heap homogeneous space compaction, "
                     << " requested defragmentation "
                     << count_requested_homogeneous_space_compaction_.load()
                     << " performed defragmentation "
                     << count_performed_homogeneous_space_compaction_.load()
                     << " ignored homogeneous space compaction "
                     << count_ignored_homogeneous_space_compaction_.load()
                     << " delayed count = "
                     << count_delayed_oom_.load();
        }
        break;
      }
      default: {
        // Do nothing for others allocators.
      }
    }
  }
#undef PERFORM_SUSPENDING_OPERATION
  // If the allocation hasn't succeeded by this point, throw an OOM error.
  if (ptr == nullptr) {
    ScopedAllowThreadSuspension ats;
    ThrowOutOfMemoryError(self, alloc_size, allocator);
  }
  return ptr;
}
TryToAllocate dispatches to the Alloc of the concrete allocator; we only need to look at the non-moving case:
case kAllocatorTypeNonMoving: {
  ret = non_moving_space_->Alloc(self,
                                 alloc_size,
                                 bytes_allocated,
                                 usable_size,
                                 bytes_tl_bulk_allocated);
  break;
}
The non-moving space is in fact a DlMallocSpace; after several layers of calls we finally land here:
inline mirror::Object* DlMallocSpace::AllocWithoutGrowthLocked(
    Thread* /*self*/, size_t num_bytes,
    size_t* bytes_allocated,
    size_t* usable_size,
    size_t* bytes_tl_bulk_allocated) {
  mirror::Object* result = reinterpret_cast<mirror::Object*>(mspace_malloc(mspace_, num_bytes));
  if (LIKELY(result != nullptr)) {
    if (kDebugSpaces) {
      CHECK(Contains(result)) << "Allocation (" << reinterpret_cast<void*>(result)
                              << ") not in bounds of allocation space " << *this;
    }
    size_t allocation_size = AllocationSizeNonvirtual(result, usable_size);
    DCHECK(bytes_allocated != nullptr);
    *bytes_allocated = allocation_size;
    *bytes_tl_bulk_allocated = allocation_size;
  }
  return result;
}
So it is mspace_malloc that failed. mspace_malloc can be thought of as a malloc performed inside a designated space (dlmalloc's mspace is an independent heap backed by its own memory region). Next, let's look at how the resulting OOM is reported:
void Heap::ThrowOutOfMemoryError(Thread* self, size_t byte_count, AllocatorType allocator_type) {
  // If we're in a stack overflow, do not create a new exception. It would require running the
  // constructor, which will of course still be in a stack overflow.
  if (self->IsHandlingStackOverflow()) {
    self->SetException(
        Runtime::Current()->GetPreAllocatedOutOfMemoryErrorWhenHandlingStackOverflow());
    return;
  }
  std::ostringstream oss;
  size_t total_bytes_free = GetFreeMemory();
  // This builds the first half of the OOM message we saw; the key is the second half.
  oss << "Failed to allocate a " << byte_count << " byte allocation with " << total_bytes_free
      << " free bytes and " << PrettySize(GetFreeMemoryUntilOOME()) << " until OOM,"
      << " target footprint " << target_footprint_.load(std::memory_order_relaxed)
      << ", growth limit "
      << growth_limit_;
  // If the allocation failed due to fragmentation, print out the largest continuous allocation.
  // As long as the free space exceeds the requested size, the corresponding space is checked
  // for fragmentation.
  if (total_bytes_free >= byte_count) {
    space::AllocSpace* space = nullptr;
    if (allocator_type == kAllocatorTypeNonMoving) {
      space = non_moving_space_;
    } else if (allocator_type == kAllocatorTypeRosAlloc ||
               allocator_type == kAllocatorTypeDlMalloc) {
      space = main_space_;
    } else if (allocator_type == kAllocatorTypeBumpPointer ||
               allocator_type == kAllocatorTypeTLAB) {
      space = bump_pointer_space_;
    } else if (allocator_type == kAllocatorTypeRegion ||
               allocator_type == kAllocatorTypeRegionTLAB) {
      space = region_space_;
    }
    // There is no fragmentation info to log for large-object space.
    if (allocator_type != kAllocatorTypeLOS) {
      CHECK(space != nullptr) << "allocator_type:" << allocator_type
                              << " byte_count:" << byte_count
                              << " total_bytes_free:" << total_bytes_free;
      // LogFragmentationAllocFailure returns true if byte_count is greater than
      // the largest free contiguous chunk in the space. Return value false
      // means that we are throwing OOME because the amount of free heap after
      // GC is less than kMinFreeHeapAfterGcForAlloc in proportion of the heap-size.
      // Log an appropriate message in that case.
      if (!space->LogFragmentationAllocFailure(oss, byte_count)) {
        oss << "; giving up on allocation because <"
            << kMinFreeHeapAfterGcForAlloc * 100
            << "% of heap free after GC.";
      }
    }
  }
  self->ThrowOutOfMemoryError(oss.str().c_str());
}
Since our allocator is kAllocatorTypeNonMoving and the non-moving space is a DlMallocSpace, the internal logic is:
bool DlMallocSpace::LogFragmentationAllocFailure(std::ostream& os,
                                                 size_t failed_alloc_bytes) {
  Thread* const self = Thread::Current();
  size_t max_contiguous_allocation = 0;
  // To allow the Walk/InspectAll() to exclusively-lock the mutator
  // lock, temporarily release the shared access to the mutator
  // lock here by transitioning to the suspended state.
  Locks::mutator_lock_->AssertSharedHeld(self);
  ScopedThreadSuspension sts(self, ThreadState::kSuspended);
  Walk(MSpaceChunkCallback, &max_contiguous_allocation);
  if (failed_alloc_bytes > max_contiguous_allocation) {
    os << "; failed due to fragmentation (largest possible contiguous allocation "
       << max_contiguous_allocation << " bytes)";
    return true;
  }
  return false;
}
The largest contiguous free chunk is necessarily smaller than the requested size at this point (otherwise the allocation would have succeeded), so the failure is reported as fragmentation.
At this point essentially every question can be answered:
- Why the OOM? Because mspace_malloc failed, and a failed malloc essentially means there was no usable virtual address range left to allocate from. On a 32-bit process the entire address space is only 4 GB (roughly 3 GB usable from user space), so this points at a native-layer memory leak. A sketch for monitoring this follows below.
- Why is it reported as fragmentation? Because the virtual machine believes there is still plenty of free space, while the largest contiguous chunk is smaller than the request, so it concludes the problem is fragmentation.
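To watch for address-space exhaustion in the field, one option is to sample the process's virtual memory size, which Linux/Android exposes as the VmSize field of /proc/self/status. A minimal sketch; the class name, log tag, and call site are illustrative, not part of the original analysis:
import android.util.Log;
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class VmSizeMonitor {
    // Logs the process's virtual address-space usage. In a 32-bit process the
    // usable address space is well under 4 GB, so a steadily growing VmSize
    // alongside a stable Java heap points at a native leak.
    public static void logVmSize() {
        try (BufferedReader reader = new BufferedReader(new FileReader("/proc/self/status"))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.startsWith("VmSize:")) {
                    Log.i("VmSizeMonitor", line.trim());
                    return;
                }
            }
        } catch (IOException e) {
            Log.w("VmSizeMonitor", "failed to read /proc/self/status", e);
        }
    }
}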
Next, an experiment to verify the conclusion: the native side leaks roughly 2 GB of address space, then the Java side keeps allocating direct buffers.
extern "C" JNIEXPORT jstring JNICALL
Java_com_example_memleakdemo_MainActivity_stringFromJNI(
JNIEnv* env,
jobject /* this */) {
for (int i =0; i < 2000; i ++) {
void *p = malloc(1 * 1024 * 1024);
}
std::string hello = "Hello from C++";
return env->NewStringUTF(hello.c_str());
}
private List<ByteBuffer> list = new ArrayList<>();

@Override
protected void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    binding = ActivityMainBinding.inflate(getLayoutInflater());
    setContentView(binding.getRoot());
    // Example of a call to a native method
    TextView tv = binding.sampleText;
    new Thread(new Runnable() {
        @Override
        public void run() {
            stringFromJNI();    // leak native address space first
            allocateBuffer();   // then keep allocating direct buffers
        }
    }).start();
}

private void allocateBuffer() {
    int i = 0;
    while (true) {
        // Each DirectByteBuffer is backed by a non-movable byte[] in the non-moving space.
        ByteBuffer bb = ByteBuffer.allocateDirect(10 * 1024 * 1024);
        list.add(bb);
        i++;
        Log.i("lhr", "allocate " + i * 10 + " MB");
    }
}
Run it on a 32-bit phone, and the result is:
[image: OOM log from the experiment]
This confirms the analysis.