Skip to content

Commit

Permalink
Merge tag 'kvm-s390-20140422' of git://git.kernel.org/pub/scm/linux/k…
Browse files Browse the repository at this point in the history
…ernel/git/kvms390/linux into queue

Lazy storage key handling
-------------------------
Linux does not use the ACC and F bits of the storage key. Newer Linux
versions also do not use the storage keys for dirty and reference
tracking. We can optimize the guest handling for those guests for faults
as well as page-in and page-out by simply not caring about the guest
visible storage key. We trap guest storage key instruction to enable
those keys only on demand.

Migration bitmap

Until now s390 never provided a proper dirty bitmap.  Let's provide a
proper migration bitmap for s390. We also change the user dirty tracking
to a fault based mechanism. This makes the host completely independent
from the storage keys. Long term this will allow us to back guest memory
with large pages.

per-VM device attributes
------------------------
To avoid the introduction of new ioctls, let's provide the
attribute semanantic also on the VM-"device".

Userspace controlled CMMA
-------------------------
The CMMA assist is changed from "always on" to "on if requested" via
per-VM device attributes. In addition a callback to reset all usage
states is provided.

Proper guest DAT handling for intercepts
----------------------------------------
While instructions handled by SIE take care of all addressing aspects,
KVM/s390 currently does not care about guest address translation of
intercepts. This worked out fine, because
- the s390 Linux kernel has a 1:1 mapping between kernel virtual<->real
 for all pages up to memory size
- intercepts happen only for a small amount of cases
- all of these intercepts happen to be in the kernel text for current
  distros

Of course we need to be better for other intercepts, kernel modules etc.
We provide the infrastructure and rework all in-kernel intercepts to work
on logical addresses (paging etc) instead of real ones. The code has
been running internally for several months now, so it is time for going
public.

GDB support
-----------
We provide breakpoints, single stepping and watchpoints.

Fixes/Cleanups
--------------
- Improve program check delivery
- Factor out the handling of transactional memory  on program checks
- Use the existing define __LC_PGM_TDB
- Several cleanups in the lowcore structure
- Documentation

NOTES
-----
- All patches touching base s390 are either ACKed or written by the s390
  maintainers
- One base KVM patch "KVM: add kvm_is_error_gpa() helper"
- One patch introduces the notion of VM device attributes

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

Conflicts:
	include/uapi/linux/kvm.h
  • Loading branch information
matosatti committed Apr 22, 2014
2 parents 5c7411e + e325fe6 commit 63b5cf0
Show file tree
Hide file tree
Showing 33 changed files with 2,831 additions and 501 deletions.
8 changes: 4 additions & 4 deletions Documentation/virtual/kvm/api.txt
Original file line number Diff line number Diff line change
Expand Up @@ -2314,8 +2314,8 @@ struct kvm_create_device {

4.80 KVM_SET_DEVICE_ATTR/KVM_GET_DEVICE_ATTR

Capability: KVM_CAP_DEVICE_CTRL
Type: device ioctl
Capability: KVM_CAP_DEVICE_CTRL, KVM_CAP_VM_ATTRIBUTES for vm device
Type: device ioctl, vm ioctl
Parameters: struct kvm_device_attr
Returns: 0 on success, -1 on error
Errors:
Expand All @@ -2340,8 +2340,8 @@ struct kvm_device_attr {

4.81 KVM_HAS_DEVICE_ATTR

Capability: KVM_CAP_DEVICE_CTRL
Type: device ioctl
Capability: KVM_CAP_DEVICE_CTRL, KVM_CAP_VM_ATTRIBUTES for vm device
Type: device ioctl, vm ioctl
Parameters: struct kvm_device_attr
Returns: 0 on success, -1 on error
Errors:
Expand Down
26 changes: 26 additions & 0 deletions Documentation/virtual/kvm/devices/vm.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
Generic vm interface
====================================

The virtual machine "device" also accepts the ioctls KVM_SET_DEVICE_ATTR,
KVM_GET_DEVICE_ATTR, and KVM_HAS_DEVICE_ATTR. The interface uses the same
struct kvm_device_attr as other devices, but targets VM-wide settings
and controls.

The groups and attributes per virtual machine, if any, are architecture
specific.

1. GROUP: KVM_S390_VM_MEM_CTRL
Architectures: s390

1.1. ATTRIBUTE: KVM_S390_VM_MEM_CTRL
Parameters: none
Returns: -EBUSY if already a vcpus is defined, otherwise 0

Enables CMMA for the virtual machine

1.2. ATTRIBUTE: KVM_S390_VM_CLR_CMMA
Parameteres: none
Returns: 0

Clear the CMMA status for all guest pages, so any pages the guest marked
as unused are again used any may not be reclaimed by the host.
2 changes: 2 additions & 0 deletions Documentation/virtual/kvm/s390-diag.txt
Original file line number Diff line number Diff line change
Expand Up @@ -78,3 +78,5 @@ DIAGNOSE function code 'X'501 - KVM breakpoint

If the function code specifies 0x501, breakpoint functions may be performed.
This function code is handled by userspace.

This diagnose function code has no subfunctions and uses no parameters.
14 changes: 14 additions & 0 deletions arch/s390/include/asm/ctl_reg.h
Original file line number Diff line number Diff line change
Expand Up @@ -57,6 +57,20 @@ static inline void __ctl_clear_bit(unsigned int cr, unsigned int bit)
void smp_ctl_set_bit(int cr, int bit);
void smp_ctl_clear_bit(int cr, int bit);

union ctlreg0 {
unsigned long val;
struct {
#ifdef CONFIG_64BIT
unsigned long : 32;
#endif
unsigned long : 3;
unsigned long lap : 1; /* Low-address-protection control */
unsigned long : 4;
unsigned long edat : 1; /* Enhanced-DAT-enablement control */
unsigned long : 23;
};
};

#ifdef CONFIG_SMP
# define ctl_set_bit(cr, bit) smp_ctl_set_bit(cr, bit)
# define ctl_clear_bit(cr, bit) smp_ctl_clear_bit(cr, bit)
Expand Down
149 changes: 137 additions & 12 deletions arch/s390/include/asm/kvm_host.h
Original file line number Diff line number Diff line change
Expand Up @@ -39,9 +39,17 @@ struct sca_entry {
__u64 reserved2[2];
} __attribute__((packed));

union ipte_control {
unsigned long val;
struct {
unsigned long k : 1;
unsigned long kh : 31;
unsigned long kg : 32;
};
};

struct sca_block {
__u64 ipte_control;
union ipte_control ipte_control;
__u64 reserved[5];
__u64 mcn;
__u64 reserved2;
Expand Down Expand Up @@ -85,12 +93,26 @@ struct kvm_s390_sie_block {
__u8 reserved40[4]; /* 0x0040 */
#define LCTL_CR0 0x8000
#define LCTL_CR6 0x0200
#define LCTL_CR9 0x0040
#define LCTL_CR10 0x0020
#define LCTL_CR11 0x0010
#define LCTL_CR14 0x0002
__u16 lctl; /* 0x0044 */
__s16 icpua; /* 0x0046 */
#define ICTL_LPSW 0x00400000
#define ICTL_PINT 0x20000000
#define ICTL_LPSW 0x00400000
#define ICTL_STCTL 0x00040000
#define ICTL_ISKE 0x00004000
#define ICTL_SSKE 0x00002000
#define ICTL_RRBE 0x00001000
__u32 ictl; /* 0x0048 */
__u32 eca; /* 0x004c */
#define ICPT_INST 0x04
#define ICPT_PROGI 0x08
#define ICPT_INSTPROGI 0x0C
#define ICPT_OPEREXC 0x2C
#define ICPT_PARTEXEC 0x38
#define ICPT_IOINST 0x40
__u8 icptcode; /* 0x0050 */
__u8 reserved51; /* 0x0051 */
__u16 ihcpu; /* 0x0052 */
Expand All @@ -109,9 +131,21 @@ struct kvm_s390_sie_block {
psw_t gpsw; /* 0x0090 */
__u64 gg14; /* 0x00a0 */
__u64 gg15; /* 0x00a8 */
__u8 reservedb0[30]; /* 0x00b0 */
__u16 iprcc; /* 0x00ce */
__u8 reservedd0[48]; /* 0x00d0 */
__u8 reservedb0[28]; /* 0x00b0 */
__u16 pgmilc; /* 0x00cc */
__u16 iprcc; /* 0x00ce */
__u32 dxc; /* 0x00d0 */
__u16 mcn; /* 0x00d4 */
__u8 perc; /* 0x00d6 */
__u8 peratmid; /* 0x00d7 */
__u64 peraddr; /* 0x00d8 */
__u8 eai; /* 0x00e0 */
__u8 peraid; /* 0x00e1 */
__u8 oai; /* 0x00e2 */
__u8 armid; /* 0x00e3 */
__u8 reservede4[4]; /* 0x00e4 */
__u64 tecmc; /* 0x00e8 */
__u8 reservedf0[16]; /* 0x00f0 */
__u64 gcr[16]; /* 0x0100 */
__u64 gbea; /* 0x0180 */
__u8 reserved188[24]; /* 0x0188 */
Expand Down Expand Up @@ -146,6 +180,8 @@ struct kvm_vcpu_stat {
u32 exit_instruction;
u32 instruction_lctl;
u32 instruction_lctlg;
u32 instruction_stctl;
u32 instruction_stctg;
u32 exit_program_interruption;
u32 exit_instr_and_program;
u32 deliver_external_call;
Expand All @@ -164,6 +200,7 @@ struct kvm_vcpu_stat {
u32 instruction_stpx;
u32 instruction_stap;
u32 instruction_storage_key;
u32 instruction_ipte_interlock;
u32 instruction_stsch;
u32 instruction_chsc;
u32 instruction_stsi;
Expand All @@ -183,13 +220,58 @@ struct kvm_vcpu_stat {
u32 diagnose_9c;
};

#define PGM_OPERATION 0x01
#define PGM_PRIVILEGED_OP 0x02
#define PGM_EXECUTE 0x03
#define PGM_PROTECTION 0x04
#define PGM_ADDRESSING 0x05
#define PGM_SPECIFICATION 0x06
#define PGM_DATA 0x07
#define PGM_OPERATION 0x01
#define PGM_PRIVILEGED_OP 0x02
#define PGM_EXECUTE 0x03
#define PGM_PROTECTION 0x04
#define PGM_ADDRESSING 0x05
#define PGM_SPECIFICATION 0x06
#define PGM_DATA 0x07
#define PGM_FIXED_POINT_OVERFLOW 0x08
#define PGM_FIXED_POINT_DIVIDE 0x09
#define PGM_DECIMAL_OVERFLOW 0x0a
#define PGM_DECIMAL_DIVIDE 0x0b
#define PGM_HFP_EXPONENT_OVERFLOW 0x0c
#define PGM_HFP_EXPONENT_UNDERFLOW 0x0d
#define PGM_HFP_SIGNIFICANCE 0x0e
#define PGM_HFP_DIVIDE 0x0f
#define PGM_SEGMENT_TRANSLATION 0x10
#define PGM_PAGE_TRANSLATION 0x11
#define PGM_TRANSLATION_SPEC 0x12
#define PGM_SPECIAL_OPERATION 0x13
#define PGM_OPERAND 0x15
#define PGM_TRACE_TABEL 0x16
#define PGM_SPACE_SWITCH 0x1c
#define PGM_HFP_SQUARE_ROOT 0x1d
#define PGM_PC_TRANSLATION_SPEC 0x1f
#define PGM_AFX_TRANSLATION 0x20
#define PGM_ASX_TRANSLATION 0x21
#define PGM_LX_TRANSLATION 0x22
#define PGM_EX_TRANSLATION 0x23
#define PGM_PRIMARY_AUTHORITY 0x24
#define PGM_SECONDARY_AUTHORITY 0x25
#define PGM_LFX_TRANSLATION 0x26
#define PGM_LSX_TRANSLATION 0x27
#define PGM_ALET_SPECIFICATION 0x28
#define PGM_ALEN_TRANSLATION 0x29
#define PGM_ALE_SEQUENCE 0x2a
#define PGM_ASTE_VALIDITY 0x2b
#define PGM_ASTE_SEQUENCE 0x2c
#define PGM_EXTENDED_AUTHORITY 0x2d
#define PGM_LSTE_SEQUENCE 0x2e
#define PGM_ASTE_INSTANCE 0x2f
#define PGM_STACK_FULL 0x30
#define PGM_STACK_EMPTY 0x31
#define PGM_STACK_SPECIFICATION 0x32
#define PGM_STACK_TYPE 0x33
#define PGM_STACK_OPERATION 0x34
#define PGM_ASCE_TYPE 0x38
#define PGM_REGION_FIRST_TRANS 0x39
#define PGM_REGION_SECOND_TRANS 0x3a
#define PGM_REGION_THIRD_TRANS 0x3b
#define PGM_MONITOR 0x40
#define PGM_PER 0x80
#define PGM_CRYPTO_OPERATION 0x119

struct kvm_s390_interrupt_info {
struct list_head list;
Expand Down Expand Up @@ -229,6 +311,45 @@ struct kvm_s390_float_interrupt {
unsigned int irq_count;
};

struct kvm_hw_wp_info_arch {
unsigned long addr;
unsigned long phys_addr;
int len;
char *old_data;
};

struct kvm_hw_bp_info_arch {
unsigned long addr;
int len;
};

/*
* Only the upper 16 bits of kvm_guest_debug->control are arch specific.
* Further KVM_GUESTDBG flags which an be used from userspace can be found in
* arch/s390/include/uapi/asm/kvm.h
*/
#define KVM_GUESTDBG_EXIT_PENDING 0x10000000

#define guestdbg_enabled(vcpu) \
(vcpu->guest_debug & KVM_GUESTDBG_ENABLE)
#define guestdbg_sstep_enabled(vcpu) \
(vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)
#define guestdbg_hw_bp_enabled(vcpu) \
(vcpu->guest_debug & KVM_GUESTDBG_USE_HW_BP)
#define guestdbg_exit_pending(vcpu) (guestdbg_enabled(vcpu) && \
(vcpu->guest_debug & KVM_GUESTDBG_EXIT_PENDING))

struct kvm_guestdbg_info_arch {
unsigned long cr0;
unsigned long cr9;
unsigned long cr10;
unsigned long cr11;
struct kvm_hw_bp_info_arch *hw_bp_info;
struct kvm_hw_wp_info_arch *hw_wp_info;
int nr_hw_bp;
int nr_hw_wp;
unsigned long last_bp;
};

struct kvm_vcpu_arch {
struct kvm_s390_sie_block *sie_block;
Expand All @@ -238,11 +359,13 @@ struct kvm_vcpu_arch {
struct kvm_s390_local_interrupt local_int;
struct hrtimer ckc_timer;
struct tasklet_struct tasklet;
struct kvm_s390_pgm_info pgm;
union {
struct cpuid cpu_id;
u64 stidp_data;
};
struct gmap *gmap;
struct kvm_guestdbg_info_arch guestdbg;
#define KVM_S390_PFAULT_TOKEN_INVALID (-1UL)
unsigned long pfault_token;
unsigned long pfault_select;
Expand Down Expand Up @@ -285,7 +408,9 @@ struct kvm_arch{
struct gmap *gmap;
int css_support;
int use_irqchip;
int use_cmma;
struct s390_io_adapter *adapters[MAX_S390_IO_ADAPTERS];
wait_queue_head_t ipte_wq;
};

#define KVM_HVA_ERR_BAD (-1UL)
Expand Down
10 changes: 6 additions & 4 deletions arch/s390/include/asm/lowcore.h
Original file line number Diff line number Diff line change
Expand Up @@ -56,13 +56,14 @@ struct _lowcore {
__u16 pgm_code; /* 0x008e */
__u32 trans_exc_code; /* 0x0090 */
__u16 mon_class_num; /* 0x0094 */
__u16 per_perc_atmid; /* 0x0096 */
__u8 per_code; /* 0x0096 */
__u8 per_atmid; /* 0x0097 */
__u32 per_address; /* 0x0098 */
__u32 monitor_code; /* 0x009c */
__u8 exc_access_id; /* 0x00a0 */
__u8 per_access_id; /* 0x00a1 */
__u8 op_access_id; /* 0x00a2 */
__u8 ar_access_id; /* 0x00a3 */
__u8 ar_mode_id; /* 0x00a3 */
__u8 pad_0x00a4[0x00b8-0x00a4]; /* 0x00a4 */
__u16 subchannel_id; /* 0x00b8 */
__u16 subchannel_nr; /* 0x00ba */
Expand Down Expand Up @@ -196,12 +197,13 @@ struct _lowcore {
__u16 pgm_code; /* 0x008e */
__u32 data_exc_code; /* 0x0090 */
__u16 mon_class_num; /* 0x0094 */
__u16 per_perc_atmid; /* 0x0096 */
__u8 per_code; /* 0x0096 */
__u8 per_atmid; /* 0x0097 */
__u64 per_address; /* 0x0098 */
__u8 exc_access_id; /* 0x00a0 */
__u8 per_access_id; /* 0x00a1 */
__u8 op_access_id; /* 0x00a2 */
__u8 ar_access_id; /* 0x00a3 */
__u8 ar_mode_id; /* 0x00a3 */
__u8 pad_0x00a4[0x00a8-0x00a4]; /* 0x00a4 */
__u64 trans_exc_code; /* 0x00a8 */
__u64 monitor_code; /* 0x00b0 */
Expand Down
2 changes: 2 additions & 0 deletions arch/s390/include/asm/mmu.h
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,8 @@ typedef struct {
unsigned long vdso_base;
/* The mmu context has extended page tables. */
unsigned int has_pgste:1;
/* The mmu context uses storage keys. */
unsigned int use_skey:1;
} mm_context_t;

#define INIT_MM_CONTEXT(name) \
Expand Down
1 change: 1 addition & 0 deletions arch/s390/include/asm/mmu_context.h
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@ static inline int init_new_context(struct task_struct *tsk,
mm->context.asce_bits |= _ASCE_TYPE_REGION3;
#endif
mm->context.has_pgste = 0;
mm->context.use_skey = 0;
mm->context.asce_limit = STACK_TOP_MAX;
crst_table_init((unsigned long *) mm->pgd, pgd_entry_type(mm));
return 0;
Expand Down
3 changes: 2 additions & 1 deletion arch/s390/include/asm/pgalloc.h
Original file line number Diff line number Diff line change
Expand Up @@ -22,7 +22,8 @@ unsigned long *page_table_alloc(struct mm_struct *, unsigned long);
void page_table_free(struct mm_struct *, unsigned long *);
void page_table_free_rcu(struct mmu_gather *, unsigned long *);

void page_table_reset_pgste(struct mm_struct *, unsigned long, unsigned long);
void page_table_reset_pgste(struct mm_struct *, unsigned long, unsigned long,
bool init_skey);
int set_guest_storage_key(struct mm_struct *mm, unsigned long addr,
unsigned long key, bool nq);

Expand Down
Loading

0 comments on commit 63b5cf0

Please sign in to comment.