[Code Practice] Extending StyleGAN2: Extracting an Image's Latent Code from a Real Face

Author: 祁晏晏 | Published 2020-07-22 16:51

    Preface

    Most of this article is compiled from other sources and is kept mainly as a personal record; links to all sources are included. (I hope to produce more of my own ideas in the future.)

    This blogger mentions several ways of reconstructing a real image in his posts, including:

    • run_projector.py from the official StyleGAN2 repo
    • rolux's project_images.py
    • rolux's encode_images.py, based on Puzer
    • pbaylies' Encoder, built on first-generation StyleGAN

    (I learned a lot from this blogger's series of articles.)

    In addition, several fairly recent papers discuss reconstruction approaches, including:

    • StyleGAN2 Distillation for Feed-forward Image Manipulation (uses StyleGAN2 as a teacher: its knowledge is used to train a classifier for a given attribute and obtain a latent-code offset; shifting future latent codes by that offset yields images of the same subject with a different attribute; https://github.com/EvgenyKashin/stylegan2-distillation)
    • InterFaceGAN: Interpreting the Disentangled Face Representation Learned by GANs ("Interpreting the latent space of GANs for semantic face editing"; encodes different semantics in the latent space)
    • PULSE: Self-Supervised Photo Upsampling via Latent Space Exploration of Generative Models (generates high-resolution images from low-resolution inputs, fully self-supervised, but in some cases the result does not closely resemble the real person; https://github.com/adamian98/pulse)
    • Adversarial Latent Autoencoders (StyleALAE not only generates 1024x1024 face images with quality comparable to StyleGAN, but also supports face reconstruction and manipulation from real images at the same resolution; generation is decent, reconstruction still needs improvement; https://github.com/podgorskiy/ALAE; an interactive tool is available for adjustments)
    • Image2StyleGAN++: How to Edit the Embedded Images (an improvement on Image2StyleGAN that updates both the latent and the noise, with inpainting support)
    • MSG-GAN: Multi-Scale Gradient GAN for Stable Image Synthesis (high-quality image generation based on ProGAN/StyleGAN, relying on training tricks; probably SOTA; the code is open source and I find it quite interesting)

    Balancing work pressure against interest, I decided to start by implementing pbaylies' approach on top of StyleGAN2.

    Principle and original code

    pbaylies implemented stylegan-encoder based on "Precise Recovery of Latent Vectors from Generative Adversarial Networks". The idea: keep a learnable latent code constrained to a given range via stochastic clipping, and iteratively update it by minimizing a loss between the image it generates and the original image; the converged latent code is the one corresponding to the original image.
    The code flow is shown in the figure below:

    [Figure: pbaylies' stylegan-encoder pipeline]
    pbaylies' implementation logic is shown above: starting from an input image, a ResNet50 predicts an initial latent, and several losses are then combined into an objective that drives the iterative refinement.
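    To make the principle concrete, here is a minimal NumPy sketch of the recovery loop (not the repo's actual code); grad_of_image_loss is a hypothetical placeholder for the gradient of the image-space loss with respect to the latent:

    import numpy as np

    def recover_latent(grad_of_image_loss, steps=100, lr=0.02, threshold=2.0):
        w = np.random.randn(18, 512)             # learnable W+ latent, one 512-vector per layer
        for _ in range(steps):
            w -= lr * grad_of_image_loss(w)      # gradient step on the image-space loss
            # stochastic clipping: re-draw any entries that drifted out of range
            mask = np.abs(w) > threshold
            w[mask] = np.random.randn(int(mask.sum()))
        return w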

    Getting the code running

    Setting up the environment

    I used conda to build a virtual environment.
    Recommended versions: python=3.6, tensorflow=1.14, keras=2.3 (the exact conda commands I used are listed in the data-preparation section below).

    Download the StyleGAN2 source code

    https://github.com/NVlabs/stylegan2

    pbaylies' code depends on the dnnlib library from first-generation StyleGAN; part of its functionality was removed in the second version, so download the first-generation dnnlib and use it to replace the one in the StyleGAN2 repo.
    (The download links below all come from the blogger 闪闪·Style.)

    Link: https://pan.baidu.com/s/1j6O-bgrMn5jVFO_GrE4cew
    Extraction code: wjya

    Download the stylegan-encoder source code

    https://github.com/pbaylies/stylegan-encoder

    Download the encoder and ffhq_dataset folders from that repo, move them into the stylegan2 root directory, and rename the encoder folder to encoder_s1, for example as sketched below.
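
    A small Python sketch of that copy/rename step (it assumes stylegan-encoder and stylegan2 were cloned side by side; adjust the paths to your layout):

    import shutil

    # copy the two folders from stylegan-encoder into the stylegan2 root,
    # renaming encoder to encoder_s1 along the way
    shutil.copytree('stylegan-encoder/encoder', 'stylegan2/encoder_s1')
    shutil.copytree('stylegan-encoder/ffhq_dataset', 'stylegan2/ffhq_dataset')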

    Download the pretrained models

    Download the perceptual model "vgg16_zhang_perceptual.pkl", available from Baidu Netdisk:
    https://pan.baidu.com/s/1vP6NM9-w4s3Cy6l4T7QpbQ
    Extraction code: 5qkp

    Download the pretrained StyleGAN2 face model "stylegan2-ffhq-config-f.pkl":
    Baidu Netdisk: https://pan.baidu.com/s/1_cRyamHP_Amj0srCbiB5_g
    Extraction code: cnby

    Create a models folder in the stylegan2 root directory and put the downloaded pretrained models inside.
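
    As a quick sanity check (a sketch, assuming the paths above), make sure the StyleGAN2 pickle loads before starting any training:

    import pickle
    import dnnlib.tflib as tflib

    tflib.init_tf()
    with open('models/stylegan2-ffhq-config-f.pkl', 'rb') as f:
        _G, _D, Gs = pickle.load(f)
    print(Gs.input_shape)   # expected: [None, 512] for the FFHQ config-f generator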

    Fine-tuning the ResNet inverse network

    This part draws on two reference posts; see the References at the end.
    Create a data folder in the stylegan2 root directory to store the fine-tuned ResNet50 network.
    Create a file train_encoder.py in the stylegan2 root directory to fine-tune the ResNet50 inverse network:

    import os
    import numpy as np
    import cv2
    
    from keras.applications.imagenet_utils import preprocess_input
    from keras.layers import Dense, Reshape
    from keras.models import Sequential, Model, load_model
    from keras.applications.resnet50 import ResNet50
    from keras.optimizers import Adam
    
    import pretrained_networks
    import dnnlib.tflib as tflib
    
    
    def get_batch(batch_size, Gs, image_size=224, Gs_minibatch_size=12, w_mix=None, latent_size=18):
        """
        Generate a batch of size n for the model to train
        returns a tuple (W, X) with W.shape = [batch_size, latent_size, 512] and X.shape = [batch_size, image_size, image_size, 3]
        If w_mix is not None, W = w_mix * W0 + (1 - w_mix) * W1 with
            - W0 generated from Z0 such that W0[:,i] = constant
            - W1 generated from Z1 such that W1[:,i] != constant
    
        Parameters
        ----------
        batch_size : int
            batch size
        Gs
            StyleGan2 generator
        image_size : int
        Gs_minibatch_size : int
            batch size for the generator
        w_mix : float
    
        Returns
        -------
        tuple
            dlatent W, images X
        """
    
        # Generate W0 from Z0
        Z0 = np.random.randn(batch_size, Gs.input_shape[1])
        W0 = Gs.components.mapping.run(Z0, None, minibatch_size=Gs_minibatch_size)
    
        if w_mix is None:
            W = W0
        else:
            # Generate W1 from Z1
            Z1 = np.random.randn(latent_size * batch_size, Gs.input_shape[1])
            W1 = Gs.components.mapping.run(Z1, None, minibatch_size=Gs_minibatch_size)
            W1 = np.array([W1[batch_size * i:batch_size * (i + 1), i] for i in range(latent_size)]).transpose((1, 0, 2))
    
            # Mix styles between W0 and W1
            W = w_mix * W0 + (1 - w_mix) * W1
    
        # Generate X
        X = Gs.components.synthesis.run(W, randomize_noise=True, minibatch_size=Gs_minibatch_size, print_progress=True,
                                        output_transform=dict(func=tflib.convert_images_to_uint8, nchw_to_nhwc=True))
    
        # Preprocess images X for the Imagenet model
        X = np.array([cv2.resize(x, (image_size, image_size)) for x in X])
        X = preprocess_input(X.astype('float'))
    
        return W, X
    
    
    def finetune(save_path, image_size=224, base_model=ResNet50, batch_size=2048, test_size=1024, n_epochs=6,
                 max_patience=5, models_dir='models/stylegan2-ffhq-config-f.pkl'):
        """
        Finetunes a ResNet50 to predict W[:, 0]
    
        Parameters
        ----------
        save_path : str
            path where to save the Resnet
        image_size : int
        base_model : keras model
        batch_size :  int
        test_size : int
        n_epochs : int
        max_patience : int
    
        Returns
        -------
        None
    
        """
    
        assert image_size >= 224
    
        # Load StyleGan generator
        _, _, Gs = pretrained_networks.load_networks(models_dir)
    
        # Build model
        if os.path.exists(save_path):
            print('Loading pretrained network')
            model = load_model(save_path, compile=False)
        else:
            base = base_model(include_top=False, pooling='avg', input_shape=(image_size, image_size, 3))
            model = Sequential()
            model.add(base)
            model.add(Dense(512))
    
        model.compile(loss='mse', metrics=[], optimizer=Adam(3e-4))
        model.summary()
    
        # Create a test set
        print('Creating test set')
        W_test, X_test = get_batch(test_size, Gs)
    
        # Iterate on batches of size batch_size
        print('Training model')
        patience = 0
        best_loss = np.inf
    
        while (patience <= max_patience):
            W_train, X_train = get_batch(batch_size, Gs)
            model.fit(X_train, W_train[:, 0], epochs=n_epochs, verbose=True)
            loss = model.evaluate(X_test, W_test[:, 0])
            if loss < best_loss:
                print(f'New best test loss : {loss:.5f}')
                model.save(save_path)
                patience = 0
                best_loss = loss
            else:
                print(f'-------- test loss : {loss:.5f}')
                patience += 1
    
    
    def finetune_18(save_path, base_model=None, image_size=224, batch_size=2048, test_size=1024, n_epochs=6,
                    max_patience=8, w_mix=0.7, latent_size=18, models_dir='models/stylegan2-ffhq-config-f.pkl'):
        """
        Finetunes a ResNet50 to predict W[:, :]
    
        Parameters
        ----------
        save_path : str
            path where to save the Resnet
        image_size : int
        base_model : str
            path to the first finetuned ResNet50
        batch_size :  int
        test_size : int
        n_epochs : int
        max_patience : int
        w_mix : float
    
        Returns
        -------
        None
    
        """
    
        assert image_size >= 224
        if not os.path.exists(save_path):
            assert base_model is not None
    
        # Load StyleGan generator
        _, _, Gs = pretrained_networks.load_networks(models_dir)
    
        # Build model
        if os.path.exists(save_path):
            print('Loading pretrained network')
            model = load_model(save_path, compile=False)
        else:
            base_model = load_model(base_model)
            hidden = Dense(latent_size * 512)(base_model.layers[-1].input)
            outputs = Reshape((latent_size, 512))(hidden)
            model = Model(base_model.input, outputs)
            # Initialize the new Dense layer from the single-style Dense layer's weights, tiled latent_size times
            W, b = base_model.layers[-1].get_weights()
            model.layers[-2].set_weights([np.hstack([W] * latent_size), np.hstack([b] * latent_size)])
    
        model.compile(loss='mse', metrics=[], optimizer=Adam(1e-4))
        model.summary()
    
        # Create a test set
        print('Creating test set')
        W_test, X_test = get_batch(test_size, Gs, w_mix=w_mix, latent_size=latent_size)
    
        # Iterate on batches of size batch_size
        print('Training model')
        patience = 0
        best_loss = np.inf
    
        while (patience <= max_patience):
            W_train, X_train = get_batch(batch_size, Gs, w_mix=w_mix, latent_size=latent_size)
            model.fit(X_train, W_train, epochs=n_epochs, verbose=True)
            loss = model.evaluate(X_test, W_test)
            if loss < best_loss:
                print(f'New best test loss : {loss:.5f}')
                model.save(save_path)
                patience = 0
                best_loss = loss
            else:
                print(f'-------- test loss : {loss:.5f}')
                patience += 1
    
    
    if __name__ == '__main__':
        finetune('data/resnet.h5')
        finetune_18('data/resnet_18.h5', 'data/resnet.h5', w_mix=0.8)
        #finetune('data/resnet.h5', n_epochs=2, max_patience=1)
        #finetune_18('data/resnet_18.h5', 'data/resnet.h5', w_mix=0.8, n_epochs=2, max_patience=1)
    

    Run python train_encoder.py to get the ResNet training going. While running it, you may hit the following problems.

    1. TensorFlow error: This file requires compiler and library support for the ISO C++ 2011 standard

    Following an answer on Stack Overflow, make this change:
    In dnnlib/tflib/custom_ops.py, change line 64 to: cmd = 'nvcc --std=c++11 -DNDEBUG ' + opts.strip()

    2. undefined symbol: _ZN10tensorflow12OpDefBuilder6OutputESs

    Following a CSDN blog post, make this change:
    In dnnlib/tflib/custom_ops.py, change line 127 so that the compiler flags read: compiler-options \'-fPIC -D_GLIBCXX_USE_CXX11_ABI=1\'

    Adding encode_images_s1.py

    Create a new file encode_images_s1.py in the stylegan2 root directory. Compared with the original blogger's version, some parameter definitions are updated. Contents:

    import os
    import argparse
    import pickle
    
    from tqdm import tqdm
    import PIL.Image
    import numpy as np
    import dnnlib
    import dnnlib.tflib as tflib
    from encoder_s1.generator_model import Generator
    from encoder_s1.perceptual_model import PerceptualModel, load_images
    from keras.models import load_model
    
    import glob
    import random
    
    def str2bool(v):
        if isinstance(v, bool):
           return v
        if v.lower() in ('yes', 'true', 't', 'y', '1'):
            return True
        elif v.lower() in ('no', 'false', 'f', 'n', '0'):
            return False
        else:
            raise argparse.ArgumentTypeError('Boolean value expected.')
    
    
    def split_to_batches(l, n):
        for i in range(0, len(l), n):
            yield l[i:i + n]
    
    
    def main():
        parser = argparse.ArgumentParser(
            description='Find latent representation of reference images using perceptual losses',
            formatter_class=argparse.ArgumentDefaultsHelpFormatter)
        parser.add_argument('src_dir', help='Directory with images for encoding')
        parser.add_argument('generated_images_dir', help='Directory for storing generated images')
        parser.add_argument('dlatent_dir', help='Directory for storing dlatent representations')
        parser.add_argument('--data_dir', default='data', help='Directory for storing optional models')
        parser.add_argument('--mask_dir', default='masks', help='Directory for storing optional masks')
        parser.add_argument('--load_last', default='', help='Start with embeddings from directory')
        parser.add_argument('--dlatent_avg', default='',
                            help='Use dlatent from file specified here for truncation instead of dlatent_avg from Gs')
        parser.add_argument('--model_url', default='models/stylegan2-ffhq-config-f.pkl',
                            help='Fetch a StyleGAN model to train on from this URL')
        parser.add_argument('--model_res', default=1024, help='The dimension of images in the StyleGAN model', type=int)
        parser.add_argument('--batch_size', default=1, help='Batch size for generator and perceptual model', type=int)
    
        # Perceptual model params
        parser.add_argument('--image_size', default=256, help='Size of images for perceptual model', type=int)
        parser.add_argument('--sharpen_input', default=True, help='whether to add sharpen action for input images', type=bool)
        parser.add_argument('--resnet_image_size', default=224, help='Size of images for the Resnet model', type=int)
        parser.add_argument('--lr', default=0.02, help='Learning rate for perceptual model', type=float)
        parser.add_argument('--decay_rate', default=0.9, help='Decay rate for learning rate', type=float)
        parser.add_argument('--iterations', default=100, help='Number of optimization steps for each batch', type=int)
        parser.add_argument('--decay_steps', default=10,
                            help='Decay steps for learning rate decay (as a percent of iterations)', type=float)
        parser.add_argument('--load_effnet', default='data/finetuned_effnet.h5',
                            help='Model to load for EfficientNet approximation of dlatents')
        parser.add_argument('--load_resnet', default='data/resnet_18.h5',
                            help='Model to load for ResNet approximation of dlatents')
    
        # Loss function options
        parser.add_argument('--use_vgg_loss', default=0.4, help='Use VGG perceptual loss; 0 to disable, > 0 to scale.',
                            type=float)
        parser.add_argument('--use_adaptive_loss', default=False,
                            help='Use the adaptive robust loss function from Google Research for pixel and VGG feature loss.',
                            type=str2bool, nargs='?', const=True)
        parser.add_argument('--use_vgg_layer', default=9, help='Pick which VGG layer to use.', type=int)
        parser.add_argument('--use_pixel_loss', default=1.5,
                            help='Use logcosh image pixel loss; 0 to disable, > 0 to scale.', type=float)
        parser.add_argument('--use_mssim_loss', default=100, help='Use MS-SIM perceptual loss; 0 to disable, > 0 to scale.',
                            type=float)
        parser.add_argument('--use_lpips_loss', default=100, help='Use LPIPS perceptual loss; 0 to disable, > 0 to scale.',
                            type=float)
        parser.add_argument('--use_l1_penalty', default=1, help='Use L1 penalty on latents; 0 to disable, > 0 to scale.',
                            type=float)
        parser.add_argument('--use_discriminator_loss', default=0.5, help='Use trained discriminator to evaluate realism.',
                            type=float)
    
        # Generator params
        parser.add_argument('--randomize_noise', default=False, help='Add noise to dlatents during optimization', type=bool)
        parser.add_argument('--tile_dlatents', default=False, help='Tile dlatents to use a single vector at each scale',
                            type=bool)
        parser.add_argument('--clipping_threshold', default=2.0,
                            help='Stochastic clipping of gradient values outside of this threshold', type=float)
    
        # Masking params
        parser.add_argument('--load_mask', default=False, help='Load segmentation masks', type=bool)
        parser.add_argument('--face_mask', default=False, help='Generate a mask for predicting only the face area',
                            type=bool)
        parser.add_argument('--use_grabcut', default=True,
                            help='Use grabcut algorithm on the face mask to better segment the foreground', type=bool)
        parser.add_argument('--scale_mask', default=1.5, help='Look over a wider section of foreground for grabcut',
                            type=float)
    
        # Video params
        parser.add_argument('--video_dir', default='videos', help='Directory for storing training videos')
        parser.add_argument('--output_video', default=False, help='Generate videos of the optimization process', type=bool)
        parser.add_argument('--video_codec', default='MJPG', help='FOURCC-supported video codec name')
        parser.add_argument('--video_frame_rate', default=24, help='Video frames per second', type=int)
        parser.add_argument('--video_size', default=512, help='Video size in pixels', type=int)
        parser.add_argument('--video_skip', default=1, help='Only write every n frames (1 = write every frame)', type=int)
    
        # parse_known_args: extra options passed on the command line that are parsed later do not raise an error here; they are kept aside and used afterwards
        args, other_args = parser.parse_known_args()
    
        # steps over which the learning rate decays
        args.decay_steps *= 0.01 * args.iterations  # Calculate steps as a percent of total iterations
    
        if args.output_video:
            import cv2
            synthesis_kwargs = dict(output_transform=dict(func=tflib.convert_images_to_uint8, nchw_to_nhwc=False),
                                    minibatch_size=args.batch_size)
    
        # Collect every image file under src_dir into ref_images (the list of source images; a single image is fine too)
        ref_images = [os.path.join(args.src_dir, x) for x in os.listdir(args.src_dir)]
        ref_images = list(filter(os.path.isfile, ref_images))
    
        if len(ref_images) == 0:
            raise Exception('%s is empty' % args.src_dir)
    
        # Create the working directories
        os.makedirs(args.data_dir, exist_ok=True)
        os.makedirs(args.mask_dir, exist_ok=True)
        os.makedirs(args.generated_images_dir, exist_ok=True)
        os.makedirs(args.dlatent_dir, exist_ok=True)
        os.makedirs(args.video_dir, exist_ok=True)
    
        # Initialize generator and perceptual model
        tflib.init_tf()
        # Load the StyleGAN model
        model_file = glob.glob(args.model_url)
        if len(model_file) == 1:
            model_file = open(model_file[0], "rb")
        else:
            raise Exception('Failed to find the model')
        generator_network, discriminator_network, Gs_network = pickle.load(model_file)
    
        # Build the Generator wrapper, which takes part in constructing the VGG16 perceptual model and produces generated_image
        # generated_image is turned into generated_img_features by the perceptual model and enters the loss computation
        generator = Generator(Gs_network, args.batch_size, clipping_threshold=args.clipping_threshold,
                              tiled_dlatent=args.tile_dlatents, model_res=args.model_res,
                              randomize_noise=args.randomize_noise)
        if (args.dlatent_avg != ''):
            generator.set_dlatent_avg(np.load(args.dlatent_avg))
    
        perc_model = None
        if (args.use_lpips_loss > 0.00000001):  # '--use_lpips_loss', default = 100
            # Load the VGG16 perceptual model
            model_file = glob.glob('./models/vgg16_zhang_perceptual.pkl')
            if len(model_file) == 1:
                model_file = open(model_file[0], "rb")
            else:
                raise Exception('Failed to find the model')
            perc_model = pickle.load(model_file)
    
        # Build the VGG16 perceptual model
        perceptual_model = PerceptualModel(args, perc_model=perc_model, batch_size=args.batch_size)
        perceptual_model.build_perceptual_model(generator, discriminator_network)
    
        ff_model = None
        # Optimize (only) dlatents by minimizing perceptual loss between reference and generated images in feature space
        # tqdm is a fast, extensible progress bar for long-running Python loops
        # Split ref_images into batches of size args.batch_size; for each batch, perceptual_model.optimize() solves for the optimal dlatents of each source image
        # For each source image, the iteration starts from an initial dlatents and uses the Adam optimizer to search for the dlatents that minimize the loss, re-drawing out-of-range values from a normal distribution (the stochastic clipping method)
        for images_batch in tqdm(split_to_batches(ref_images, args.batch_size), total=len(ref_images) // args.batch_size):
            # Get the file names in this batch
            names = [os.path.splitext(os.path.basename(x))[0] for x in images_batch]
            if args.output_video:
                video_out = {}
                for name in names:
                    video_out[name] = cv2.VideoWriter(os.path.join(args.video_dir, f'{name}.avi'),
                                                      cv2.VideoWriter_fourcc(*args.video_codec), args.video_frame_rate,
                                                      (args.video_size, args.video_size))
    
            # Set the source images and their VGG16 features (the reference for the loss computation)
            perceptual_model.set_reference_images(images_batch)
            dlatents = None
            if (args.load_last != ''):  # load previous dlatents for initialization
                for name in names:
                    dl = np.expand_dims(np.load(os.path.join(args.load_last, f'{name}.npy')), axis=0)
                    if (dlatents is None):
                        dlatents = dl
                    else:
                        dlatents = np.vstack((dlatents, dl))
            else:
                if (ff_model is None):
                    if os.path.exists(args.load_resnet):
                        print("Loading ResNet Model:")
                        ff_model = load_model(args.load_resnet)
                        from keras.applications.resnet50 import preprocess_input
                if (ff_model is None):
                    if os.path.exists(args.load_effnet):
                        import efficientnet
                        print("Loading EfficientNet Model:")
                        ff_model = load_model(args.load_effnet)
                        from efficientnet import preprocess_input
                if (ff_model is not None):  # predict initial dlatents with ResNet model
                    dlatents = ff_model.predict(
                        preprocess_input(load_images(images_batch, image_size=args.resnet_image_size)))
            # Set the initial dlatents for the perceptual-model optimization; they are predicted from the source images by ResNet50 or EfficientNet
            if dlatents is not None:
                generator.set_dlatents(dlatents)
            # For each source image, wrap the optimization in a tqdm progress bar
            op = perceptual_model.optimize(generator.dlatent_variable, iterations=args.iterations)
            pbar = tqdm(op, leave=False, total=args.iterations)
            vid_count = 0
            best_loss = None
            best_dlatent = None
            # Optimize with the VGG16 perceptual_model for iterations=args.iterations steps, applying stochastic clipping
            for loss_dict in pbar:
                pbar.set_description(" ".join(names) + ": " + "; ".join(["{} {:.4f}".format(k, v)
                                                                         for k, v in loss_dict.items()]))
                if best_loss is None or loss_dict["loss"] < best_loss:
                    best_loss = loss_dict["loss"]
                    best_dlatent = generator.get_dlatents()
                if args.output_video and (vid_count % args.video_skip == 0):
                    batch_frames = generator.generate_images()
                    for i, name in enumerate(names):
                        video_frame = PIL.Image.fromarray(batch_frames[i], 'RGB').resize((args.video_size, args.video_size),
                                                                                         PIL.Image.LANCZOS)
                        video_out[name].write(cv2.cvtColor(np.array(video_frame).astype('uint8'), cv2.COLOR_RGB2BGR))
                # Update dlatent_variable with the stochastic clipping method
                generator.stochastic_clip_dlatents()
            print(" ".join(names), " Loss {:.4f}".format(best_loss))
    
            if args.output_video:
                for name in names:
                    video_out[name].release()
    
            # Generate images from found dlatents and save them
            generator.set_dlatents(best_dlatent)
            generated_images = generator.generate_images()
            generated_dlatents = generator.get_dlatents()
            for img_array, dlatent, img_name in zip(generated_images, generated_dlatents, names):
                img = PIL.Image.fromarray(img_array, 'RGB')
                img.save(os.path.join(args.generated_images_dir, f'{img_name}.png'), 'PNG')
                np.save(os.path.join(args.dlatent_dir, f'{img_name}.npy'), dlatent)
    
            generator.reset_dlatents()
    
    
    if __name__ == "__main__":
        main()
    

    Preparing the data

    I used align_images.py from stylegan-encoder to align and crop the original images.

    conda create -n tf1.14 python=3.6
    source activate tf1.14
    conda install tensorflow-gpu=1.14 keras=2.3
    conda install -c menpo dlib # used to align the dataset; most of my environment trouble came from installing dlib
    conda install pillow
    pip install opencv-python
    conda install requests tqdm
    

    python align_images.py images/input_images_origin/ images/input_images_align

    Generating latent codes

    python encode_images_s1.py images/raw_images/ images/generated_images/ images/latent_representations/ --load_resnet data/resnet_18.h5 --batch_size 1 --iterations 100

    The generated images can be viewed in images/generated_images/.
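
    The saved .npy latents can also be pushed back through the generator later on. A minimal sketch (example.npy is a placeholder file name; the Generator arguments mirror the call in encode_images_s1.py above):

    import pickle
    import numpy as np
    import PIL.Image
    import dnnlib.tflib as tflib
    from encoder_s1.generator_model import Generator

    tflib.init_tf()
    with open('models/stylegan2-ffhq-config-f.pkl', 'rb') as f:
        _G, _D, Gs = pickle.load(f)

    # same constructor arguments as in encode_images_s1.py, with batch size 1
    generator = Generator(Gs, 1, clipping_threshold=2.0, tiled_dlatent=False,
                          model_res=1024, randomize_noise=False)
    dlatent = np.load('images/latent_representations/example.npy')   # shape (18, 512)
    generator.set_dlatents(np.expand_dims(dlatent, axis=0))
    img = generator.generate_images()[0]
    PIL.Image.fromarray(img, 'RGB').save('regenerated.png')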

    Results

    The code runs end to end, but the training is not yet good.
    Left: generated image; right: original image.



    References

    https://blog.csdn.net/weixin_41943311/article/details/103030194
    https://blog.csdn.net/DLW__/article/details/104528609
    https://github.com/NVlabs/stylegan2
